The recent identification of recurrent ETS-family translocations in prostate cancer [3] and EML4-ALK in lung cancer [4] now suggests that fusion genes may also play an important role in the development of epithelial cancers. They were not detected previously because suitable techniques were lacking to identify balanced recurrent chromosomal aberrations in the often chaotic karyotypic profiles of solid tumors.

Massively parallel RNA-sequencing (RNA-seq) using next-generation sequencing instruments allows identification of gene fusions in individual cancer samples and facilitates comprehensive characterization of cellular transcriptomes [5-11]. Specifically, the new sequencing technologies enable the discovery of chimeric RNA molecules, where the same RNA molecule consists of sequences derived from two physically separated loci. Paired-end RNA-seq, where 36 to 100 bp are sequenced from both ends of 200 to 500 bp long DNA molecules, is especially suitable for identification of such chimeric mRNA transcripts. Whole-genome DNA-sequencing (DNA-seq) can also be used to identify potential fusion gene-creating rearrangements. However, only a fraction of the gene fusions predicted from DNA-seq is expected to generate an expressed fusion mRNA, making this approach a tedious way to discover activated, oncogenic fusion gene events. In contrast, RNA-seq directly identifies only those fusion genes that are expressed, providing an efficient tool to identify candidate oncogenic fusions.

In breast cancer, recurrent gene fusions have only been identified in rare subtypes, such as ETV6-NTRK3 in secretory breast carcinoma [12] and MYB-NFIB in adenoid cystic carcinoma of the breast [13]. Here, we demonstrate the effectiveness of paired-end RNA-seq in the comprehensive detection of fusion genes. Combined with a novel bioinformatic strategy, which allowed a >95% confirmation rate of the identified fusion events, we identify several novel fusion genes in breast cancer from as little as a single lane of sequencing on an Illumina GA2x instrument. We validate the fusion events and demonstrate their potential biological significance by RT-PCR, fluorescence in situ hybridization (FISH) and RNA interference (RNAi), thereby highlighting the importance of gene fusions in breast cancer.

* Correspondence: [email protected]
Contributed equally
1 Institute for Molecular Medicine Finland (FIMM), Tukholmankatu 8, Helsinki, 00290, Finland
Full list of author information is available at the end of the article

© 2011 Edgren et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Edgren et al. Genome Biology 2011, 12:R6 http://genomebiology.com/2011/12/1/R

[Figure: (a) schematic of gene X, gene Y and the fusion junction; (b) true positives and false positives, with examples labeled ACACA-STAC and PSMD3-ERBB]

Results

Criteria for identification of fusion gene candidates

To detect fusion genes in breast cancer, we performed paired-end RNA-seq using cDNA prepared from four well-characterized cell line models, as well as normal breast, which was used as a control. Between 2 and 14 million filtered short read pairs were obtained per sample for each lane of an Illumina Genome Analyzer II flow cell (Additional file 1). We discarded all fusion candidates consisting of two overlapping or adjacent genes as li.