Genome Informatics is an annual conference focusing on computational approaches for understanding the biology of genomes. It alternates between the Wellcome Trust conference center in Hinxton, UK, and Cold Spring Harbor Laboratory, NY, USA. Last year it was Hinxton's turn, so I went along, as I have the previous two times it was held in the UK.
The two keynote presentations were from Katie Pollard (University of California San Francisco, USA) and Rafael Irizarry (Dana-Farber Cancer Institute, Boston, USA). Pollard discussed the use of machine learning in genomics research, and in particular the problems that can arise. She pointed out that you shouldn’t use balanced training data if the problem you are looking at is very unbalanced (i.e. few positives and many negatives, as when identifying promoter sequences). She also noted that many machine learning models assume that data are independent and identically distributed, which is very much not the case with genomics data. Nevertheless, even when the assumptions of the model are violated, useful results can still be obtained.
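One way to see why class balance matters is to compute precision for the same classifier at different proportions of true positives. The sketch below is only an illustration of that arithmetic, not a reconstruction of Pollard's argument; the function name `precision`, the 90% sensitivity and specificity, and the 0.1% prevalence are all assumed values chosen for the example.

```python
def precision(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Fraction of positive calls that are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical classifier: 90% sensitivity and 90% specificity (assumed values).
balanced = precision(0.9, 0.9, 0.5)      # evaluated on a 50/50 data set
realistic = precision(0.9, 0.9, 0.001)   # assumed: 1 true promoter per 1,000 windows

print(f"precision on balanced data:   {balanced:.3f}")
print(f"precision at 0.1% prevalence: {realistic:.4f}")
```

With these assumed numbers, a classifier that looks 90% precise on balanced data would be right for fewer than 1 in 100 of its positive calls at genome-like class proportions.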
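Delegates who had attended previous iterations of this conference told me how it has changed since it first started: now there are more talks discussing the biology revealed by the informatics rather than the informatics methods themselves. This iteration was no different, with several talks about analyzing large numbers of cancer genomes to find variants, or the genomes of large cohorts of individuals to find variants associated with developmental disorders. Going beyond trying to identify disease-associated variants, Sri Kosuri (University of California Los Angeles, USA) talked about his experiments testing thousands of SNPs in reporter gene constructs for their effect on splicing.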
One biology talk that I found particularly interesting was from Lucia Spangenberg (Institut Pasteur de Montevideo, Uruguay), who has been attempting to reconstruct the genome of the Charruas, the indigenous people of Uruguay who were exterminated in the 19th century. Spangenberg found that the genomes of ten modern-day Uruguayans between them contain enough Charruan DNA to be able to reconstruct 99% of the Charruan genome. In general, people’s native genetic ancestry was higher than their self-reported native identity.
One trend that particularly intrigued us at Genome Biology was the increased number of methods for representing genomes in a graph format, with variants shown as alternative branches, rather than the traditional linear reference representation. This was described for both prokaryotic genomes (Rachel Colquhoun, Oxford University, UK) and eukaryotic genomes (Prithicka Sritharan, Quadram Institute Bioscience, UK). We found this interesting, as we have been discussing this for a while, and have just issued a call for papers for an article collection on graph genomes.
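To make the idea of variants as alternative branches concrete, here is a toy sketch of a sequence graph containing a single SNP. It is not the data model of any tool presented at the conference; the node names, sequences, and the helper `haplotypes` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Toy sequence graph: each node holds a stretch of sequence,
# and edges list the nodes that may follow it.
nodes: Dict[str, str] = {
    "n1": "ACGT",
    "n2": "A",     # reference allele
    "n3": "G",     # alternative allele (a SNP), shown as a parallel branch
    "n4": "TTCA",
}
edges: Dict[str, List[str]] = {
    "n1": ["n2", "n3"],
    "n2": ["n4"],
    "n3": ["n4"],
    "n4": [],
}


def haplotypes(node: str) -> List[str]:
    """Enumerate every sequence spelled by a path from `node` to a sink."""
    successors = edges[node]
    if not successors:
        return [nodes[node]]
    return [nodes[node] + tail for nxt in successors for tail in haplotypes(nxt)]


print(haplotypes("n1"))  # ['ACGTATTCA', 'ACGTGTTCA']
```

A linear reference would store only the first of these two sequences, whereas the graph keeps both alleles as equally valid paths.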
I am planning on attending this year’s Genome Informatics conference in Cold Spring Harbor, and it will be fascinating to see how the different location, with a different set of delegates, affects the feel and focus of the conference. However different it turns out to be, I predict it will be just as fascinating as last year’s conference.