Low Duplicability and Network Fragility of Cancer Genes Davide Rambaldi
Background and Aim of the Project: High heterogeneity and a high number (~600) of genes mutated in cancer. Identification of systems-level properties. Better understanding of the genetic determinants of cancer progression. Identification of candidate cancer genes.
Choice of Systems-level Properties. Genomic duplicability: tendency to retain conserved and/or recent duplicates. Network topology: position of the protein in a protein-protein interaction network. Diagram labels: Duplicability (Zhang, 2006) (Sun, 2006); Network connectivity (Wu, 2005) (Prachumwat, 2006); Fragility (Veitia, 2002).
~600 genes mutated in cancer
Detection of Genomic Duplicates. Reference set (N=349): 83.68% singletons, 16.32% duplicable genes. Benchmark set (N=254): 89.7% singletons, 10.3% duplicable genes.
Example of a duplicable gene: RARA (retinoic acid receptor alpha). Best hit: coverage 99%. First duplication: coverage 68%. Second duplication: coverage 65%. Spurious hit: coverage 9%.
Do cancer and CAN-genes duplicate more or less than the rest of human genes? Comparison to other human genes. Reference set: 83.7% singletons, 16.3% duplicable genes. Benchmark set: 89.7% singletons, 10.3% duplicable genes.
Comparison to other human genes. Human genes = 24,202
Genes mutated in cancer tend to duplicate less than other human genes (reference set and benchmark set).
Is this really a systems-level property? Human genes = 24,202
Genes mutated in cancer duplicate less than other human genes with the same functional distribution
From Genomes to Network. Does this also apply to cancer genes? Duplicability, network connectivity, fragility.
Human Interaction Network: 9264 nodes (proteins), 34564 edges (interactions). 76% singletons, 24% duplicable proteins. Reference set: 304/349. Benchmark set: 154/254.
Resulting Network
Network Analysis, Global Topology. Degree (d): measure of connectivity of each node. Clustering coefficient (cc): measure of interconnectivity of each node. Examples: d=4, cc=0; d=4, cc=0.3.
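The two measures can be illustrated with a minimal sketch. It assumes the usual definition of the local clustering coefficient (the fraction of a node's neighbour pairs that are themselves linked); the slide does not state the formula, and the toy graphs and function names below are hypothetical, not taken from the talk's data.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from edge pairs."""
    adjacency = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def degree(adjacency, node):
    """Degree d: number of direct interaction partners of the node."""
    return len(adjacency[node])


def clustering_coefficient(adjacency, node):
    """Fraction of neighbour pairs that are themselves connected (0.0 if d < 2)."""
    neighbours = adjacency[node]
    d = len(neighbours)
    if d < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return linked / (d * (d - 1) / 2)


# Hypothetical toy graphs: a hub "H" with four partners.
star = build_adjacency([("H", "A"), ("H", "B"), ("H", "C"), ("H", "D")])
partly_linked = build_adjacency(
    [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("A", "B"), ("C", "D")]
)

print(degree(star, "H"), clustering_coefficient(star, "H"))
print(degree(partly_linked, "H"), round(clustering_coefficient(partly_linked, "H"), 2))
```

In the first graph none of the hub's four partners interact, so cc is 0. In the second, two of the six possible partner pairs are linked, giving cc = 2/6, which is one way to obtain a value close to the slide's d=4, cc=0.3 example.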
Global topology. Scale-free network: few nodes with many connections, many nodes with few connections (Barabási and Albert, 1999).
How do cancer genes behave in the network? Duplicability, network connectivity, fragility.
Global topology of singleton and duplicable proteins. In the entire network, singleton proteins are less connected than duplicable proteins but have a higher clustering coefficient. P < 0.0001 (Wilcoxon test); P = 0.0163 (Wilcoxon test). Groups compared: singleton, duplicable.
Global topology of cancer proteins. Unlike most singletons, proteins mutated in cancer are more connected than other proteins and have a higher clustering coefficient. P < 0.0001 (Wilcoxon test); P < 0.0001 (Wilcoxon test). Groups compared: singleton, cancer.
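The slides report P values from a "Wilcoxon Test" without naming the variant or the software. Because the groups compared are unpaired, the sketch below assumes the two-sample rank-sum form (also known as Mann-Whitney) and uses a normal approximation with tie correction and no continuity correction. The function names and the input values are hypothetical.

```python
import math


def midranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided rank-sum test; returns (U statistic for x, approximate p-value)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t**3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical degree values for two groups of proteins.
group_a = [1, 2, 2, 3, 4, 5]
group_b = [4, 6, 7, 9, 12, 15]
u, p = rank_sum_test(group_a, group_b)
print(u, p)
```

For small samples an exact test is preferable to the normal approximation; with thousands of proteins per group, as in the network described here, the approximation is the usual choice.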
Local Topology: measure the enrichment of subgraphs in the network. We analyzed 3-node and 4-node subgraphs.
Local topology of the entire network. The human network is enriched in the most interconnected subgraphs.
Local topology of duplicable and singleton proteins. No significant difference between singleton and duplicable proteins in the network motifs.
Local topology of cancer and CAN-proteins
Summary. In the entire network: singletons are less connected but more interconnected than duplicable proteins; singletons and duplicable proteins are equally represented in the network motifs. BUT cancer genes, mainly singletons, code for protein hubs of highly interconnected modules of the human network.
Data interpretation: ~94% of the entire network; ~6% of the entire network.
Data interpretation: duplicability, network connectivity, fragility, candidates.
Possible candidates: 101 singleton genes with >20 connections and cc > 0.1, significantly enriched in Gene Ontology terms related to cancer.
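The selection rule on this slide (singleton, more than 20 connections, cc above 0.1) amounts to a three-way filter. The sketch below only illustrates that rule: the record type, function name, and example proteins are hypothetical, and the strict inequalities are read directly from the ">" signs on the slide.

```python
from typing import NamedTuple


class ProteinStats(NamedTuple):
    name: str
    is_singleton: bool
    degree: int
    clustering: float


def select_candidates(proteins, min_degree=20, min_clustering=0.1):
    """Keep singletons with degree > min_degree and clustering > min_clustering."""
    return [
        p.name
        for p in proteins
        if p.is_singleton and p.degree > min_degree and p.clustering > min_clustering
    ]


# Hypothetical records, not taken from the talk's data.
example = [
    ProteinStats("P1", True, 35, 0.25),   # passes all three criteria
    ProteinStats("P2", False, 40, 0.30),  # duplicable, so excluded
    ProteinStats("P3", True, 20, 0.50),   # degree is not strictly above 20
    ProteinStats("P4", True, 50, 0.05),   # clustering coefficient too low
]

print(select_candidates(example))
```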
Network of candidate cancer genes
Network of Cancer genes (developed by Federico Giorgi) http://bio.ifom-ieo-campus.it/ncg/
Many thanks to: Ciccarelli Group (Francesca Ciccarelli, Anna DeGrassi, Federico Giorgi, Matteo Dantonio); Ciliberto Group (Andrea Ciliberto, Fabrizio Capuani, Romilde Manzoni, Federico Vaggi); Statistics (Giovanni d’Ario, Lara Lusa); IT support (Davide Cittaro); and all the bioinfo crew.
Duplicates definition: 60% coverage
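A minimal sketch of how a 60% coverage cutoff could separate duplicable genes from singletons, using the coverage values quoted in the deck for RARA and FEV. It assumes that the best hit is the gene's own locus and that the cutoff is inclusive; neither detail is stated on the slides, and the function names are hypothetical.

```python
def count_duplicates(hit_coverages, threshold=0.60):
    """Count additional genomic hits at or above the coverage threshold.

    hit_coverages holds the coverage fractions of all hits for one gene,
    including the best hit (assumed here to be the gene's own locus).
    """
    if not hit_coverages:
        return 0
    others = sorted(hit_coverages, reverse=True)[1:]  # drop the best hit
    return sum(1 for c in others if c >= threshold)


def is_duplicable(hit_coverages, threshold=0.60):
    """A gene is called duplicable if at least one extra hit passes the cutoff."""
    return count_duplicates(hit_coverages, threshold) > 0


rara_hits = [0.99, 0.68, 0.65, 0.09]  # best hit, two duplications, spurious hit
fev_hits = [1.00, 0.35]               # best hit and one low-coverage hit

print(count_duplicates(rara_hits), is_duplicable(fev_hits))
```

With these values RARA has two hits above the cutoff besides its best hit and is called duplicable, while FEV has none and is called a singleton, matching how the two genes are labelled in the deck.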
RARA and NR2C2
A singleton gene: FEV (ETS oncogene family). Hits: coverage = 100%, identity = 100%; coverage = 35%, identity = 86%.
Changing threshold. Changing the threshold by 10% doesn’t change the results: our observations are independent of the chosen coverage threshold value.
Is this signal real? HPRD is a literature-based database: is it biased towards well-studied genes? (… and cancer genes are among them.) Is there a correlation between connectivity in HPRD and abstracts in PubMed? What is the connectivity of cancer proteins when using only interactions coming from high-throughput experiments?
Network Randomization: real network compared with edge randomization.
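The slide does not say how edges were randomized. One common choice, assumed here purely for illustration, is the degree-preserving edge swap: two edges (a, b) and (c, d) are replaced by (a, d) and (c, b), so every protein keeps its number of partners while the wiring is shuffled. The function names and the toy network are hypothetical, and the input is assumed to contain no self-loops or duplicate edges.

```python
import random


def randomize_edges(edges, n_swaps, seed=None, max_tries_per_swap=100):
    """Return a rewired copy of an undirected edge list, preserving node degrees."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    if len(edge_list) < 2:
        return edge_list
    present = {frozenset(e) for e in edge_list}
    done = 0
    tries = 0
    while done < n_swaps and tries < n_swaps * max_tries_per_swap:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # pick the orientation of the second edge at random
        if len({a, b, c, d}) < 4:
            continue  # edges share a node: the swap would create a self-loop or change nothing
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # the swap would duplicate an existing interaction
        present.discard(frozenset((a, b)))
        present.discard(frozenset((c, d)))
        present.update((new1, new2))
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def degree_counts(edges):
    """Number of edges touching each node."""
    counts = {}
    for a, b in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


# Hypothetical toy network: a ring of eight proteins.
ring = [(k, (k + 1) % 8) for k in range(8)]
shuffled = randomize_edges(ring, n_swaps=20, seed=1)
print(degree_counts(shuffled) == degree_counts(ring))
```

Comparing a statistic such as the clustering coefficient between the real network and many such rewired copies shows whether the observed value exceeds what the degree distribution alone would produce.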
Network of Cancer genes: public access to our data (developed by F.M. Giorgi)

PhD midterm report


Editor's Notes

  1. Hello, my name is Davide Rambaldi and I work in the Bioinformatics and Evolutionary Genomics of Cancer group. I will present the results of the first two years of my PhD. In these two years I focused on the analysis of human genes mutated in cancer, and today I will talk about their properties at the genomic level and in the context of a protein-protein interaction network.