Recent developments have led to an enormous increase in publicly available large genomic data, including complete genomes. The 1000 Genomes Project was a major contributor, releasing the results of sequencing a large number of individual genomes and allowing for a myriad of large-scale studies on human genetic variation. However, the tools currently available are insufficient when the goal concerns some analyses of data sets encompassing more than hundreds of base pairs and when considering haplotype sequences of single nucleotide polymorphisms (SNPs). Here, we present a new and potent tool to deal with large data sets, allowing the computation of a variety of summary statistics of population genetic data and increasing the speed of data analysis.
Twelve neurological disorders are caused by gene-specific CAG/CTG repeat expansions that are highly unstable upon transmission to offspring. This intergenerational repeat instability is clinically relevant since disease onset, progression and severity are associated with repeat size. Studies of model organisms revealed the involvement of some DNA replication and repair genes in the process of repeat instability; however, little is known about their role in patients. Here, we used an association study to search for genetic modifiers of (CAG)n instability in 137 parent-child transmissions in Machado-Joseph disease (MJD/SCA3). With the hypothesis that variants in genes involved in DNA replication, repair or recombination might alter the MJD CAG instability patterns, we screened 768 SNPs from 93 such genes. We found a variant in ERCC6 (rs2228528) associated with an expansion bias of MJD alleles. When using a gene-gene interaction model, the allele combination G-A (rs4140804-rs2972388) of RPA3-CDK7 is also associated with MJD instability in a direction-dependent manner. Interestingly, the transcription-coupled repair factor ERCC6 (aka CSB), the single-strand binding protein RPA, and the CDK7 kinase, part of the TFIIH transcription repair complex, have all been linked to transcription-coupled repair. This is the first study performed in patient samples to implicate specific modifiers of CAG instability in humans. In summary, we found variants in three transcription-coupled repair genes associated with the MJD mutation, which points to distinct mechanisms of (CAG)n instability.
Version: Postprint (identical content as published paper) This is a self-archived document from i3S-Instituto de Investigação e Inovação em Saúde in the University of Porto Open Repository For Open Access to more of our publications, please visit https://repositorio-aberto.up.pt/
Ornithine transcarbamylase (OTC, EC 2.1.3.3) deficiency (OTCD; OMIM #311250) is known to be genetically very heterogeneous, with many cases occurring de novo, due to an exceptional instability of the OTC gene. We report a new G > T substitution in the first nucleotide of intron 2, and we also describe a novel SNP (IVS8 + 35 nt: G > T) with very convenient frequencies (62%/38%) for its use as an extra tool for OTCD diagnosis in cases of suspected deletions.
The first population-genetics study for the world's newest independent country, East Timor, is presented. In this preliminary work, part of a major ongoing study on East Timor genetic diversity, the allele frequencies and some statistical parameters of forensic interest were determined for the 15 loci included in the AmpFLSTR Identifiler genotyping kit. A total of 107 samples, collected from East Timorese of several linguistic groups, were typed. All markers are in Hardy-Weinberg equilibrium (except for D2S1338 and D5S818, but the deviations do not reach significance after Bonferroni correction). Observed heterozygosities varied between 72% (D5S818) and 92% (D8S1179).
Although gene-free areas compose the great majority of eukaryotic genomes, a significant fraction of genes overlaps, i.e., unique nucleotide sequences are part of more than one transcription unit. In this work, the evolutionary history and origin of a same-strand gene overlap is dissected through the analysis of COG8 (component of oligomeric Golgi complex 8) and PDF (peptide deformylase). Comparative genomic surveys reveal that the relative locations of these two genes have been changing over the last 445 million years from distinct chromosomal locations in fish to overlapping in rodents and primates, indicating that the overlap between these genes precedes their divergence. The overlap between the two genes was initiated by the gain of a novel splice donor site between the COG8 stop codon and PDF initiation codon. Splicing is accomplished by the use of the PDF acceptor, leading COG8 to share the 3' end with PDF. In primates, loss of the ancestral polyadenylation signal for COG8 makes the overlap between COG8 and PDF mandatory, while in mouse and rat concurrent overlapping and non-overlapping Cog8 transcripts exist. Altogether, we demonstrate that the origin, evolution and preservation of the COG8/PDF same-strand overlap follow similar mechanistic steps as those documented for antisense overlaps, where gain and/or loss of splice sites and polyadenylation signals seem to drive the process.
The level of molecular heterogeneity associated with α1-antitrypsin gene products was assessed in the population of northern Portugal using three restriction fragment length polymorphisms (RFLPs) corresponding to specific amino acid substitutions and a highly variable (CA)n repeat polymorphism located at the 5′ end of the PI gene. The allelic affinities inferred from the analysis of the DNA polymorphisms essentially agree with the evolutionary pattern proposed for the PI gene products on the basis of their amino acid sequences. PI*Z can be considered the most recent common PI allele and was found to be associated with the same predominant haplotype previously reported in northern European populations, thus confirming the hypothesis that most European Z alleles are derived from a single mutation. However, a rare deficient variant that is the likely result of a recurrent Z mutation on an M2 or M4 background was additionally observed. PI*S was also found to be associated with a strongly predominant haplotype and seems to be the second most recent PI common allele, while M2 and M3 show weaker associations, suggesting more ancient origins of their corresponding mutations. M1Ala213 and M1Val213 display more homogeneous (CA)n allele frequency distributions, M1Ala213 representing the most ancient PI allele as inferred from its highest variance in (CA)n allele length.
For decades, chromosomal inversions have been regarded as fascinating evolutionary elements as they are expected to suppress recombination between chromosomes with opposite orientations, leading to the accumulation of genetic differences between the two configurations over time. Here, making use of publicly available population genotype data for the largest polymorphic inversion in the human genome (8p23-inv), we assessed whether this inhibitory effect of inversion rearrangements led to significant differences in the recombination landscape of two homologous DNA segments with opposite orientation. Our analysis revealed that the accumulation of genetic differentiation is positively correlated with the variation in recombination profiles. The observed recombination dissimilarity between inversion types is consistent across all populations analyzed and surpasses the effects of geographic structure, suggesting that both structures (orientations) have been evolving independently over an extended period of time, despite being subjected to the very same demographic history. Aside from this mainly independent evolution, we also identified a short segment (350 kb, <10% of the whole inversion) in the central region of the inversion where the genetic divergence between the two structural haplotypes is diminished. Although this is difficult to demonstrate, it could be due to gene flow (possibly via double crossing-over events), which is consistent with the higher recombination rates surrounding this segment. This study demonstrates for the first time that chromosomal inversions influence the recombination landscape at a fine scale and highlights the role of these rearrangements as drivers of genome evolution.
Allele and haplotype frequencies of seven Y-chromosome STR loci were determined from a sample of 95 and 16 unrelated males from Madeira and Porto Santo Islands, respectively.
The c.156_157insAlu BRCA2 mutation has so far only been reported in hereditary breast/ovarian cancer (HBOC) families of Portuguese origin. Since this mutation is not detectable using the commonly used screening methodologies and must be specifically sought, we screened for this rearrangement in a total of 5,440 suspected HBOC families from 22 labs from 13 countries from several continents. Whereas the c.156_157insAlu BRCA2 mutation was detected in 11 of 149 suspected HBOC families from Portugal, representing 37.9% of all deleterious mutations, in other countries it was detected only in one proband living in France and in four individuals requesting predictive testing living in France and in the USA, all of them relatively recent immigrants of Portuguese origin in those countries. After performing an extensive haplotype study in carrier families, we estimate that this founder mutation occurred 558±215 years ago. We further demonstrate significant quantitative differences regarding the production of the BRCA2 full-length RNA and the transcript with exon 3 skipping in c.156_157insAlu BRCA2 mutation carriers and in controls, indicating that disruption of alternative transcript ratios is the mechanism causing hereditary breast/ovarian cancer associated with this BRCA2 rearrangement. We further show that the cumulative incidence of breast cancer in c.156_157insAlu BRCA2 mutation carriers does not differ from that of other BRCA2 and BRCA1 pathogenic mutations, further strengthening its role as the major contributor to hereditary predisposition to breast cancer in Portugal. We recommend that all suspected HBOC families from Portugal or with Portuguese ancestry are specifically tested for this rearrangement, ideally prior to screening of the entire coding regions of BRCA1 and BRCA2.
European journal of human genetics : EJHG, Jan 5, 2016
Understanding the functional sequelae of amino-acid replacements is of fundamental importance in medical genetics. Perhaps the most intuitive way to assess the potential pathogenicity of a given human missense variant is by measuring the degree of evolutionary conservation of the substituted amino-acid residue, a feature that generally serves as a good proxy metric for the functional/structural importance of that residue. However, the presence of putatively compensated variants as the wild-type alleles in orthologous proteins of other mammalian species not only challenges this classical view of amino-acid essentiality but also precludes the accurate evaluation of the functional impact of this type of missense variant using currently available bioinformatic prediction tools. Compensated variants constitute at least 4% of all known missense variants causing human-inherited disease and hence represent an important potential source of error in that they are likely to be disproportionat...
Background: Pyruvate kinase (PK) deficiency, causing hemolytic anemia, has been associated with malaria protection, and its prevalence in sub-Saharan Africa is not known so far. This work shows the results of a study undertaken to determine PK deficiency occurrence in some sub-Saharan African countries, as well as to find a prevalent PK variant underlying this deficiency. Materials and Methods: Blood samples of individuals from four malaria-endemic countries (Mozambique, Angola, Equatorial Guinea and Sao Tome and Principe) were analyzed in order to determine PK deficiency occurrence and detect any possible high-frequency PK variant mutation. The association between this mutation and malaria was ascertained through association studies involving sample groups from individuals showing different malaria infection and outcome status. Results: The percentage of individuals showing a reduced PK activity in Maputo was 4.1%, and the missense mutation G829A (Glu277Lys) in the PKLR gene (only identified in three individuals worldwide to date) was identified at a high frequency. Heterozygous carrier frequency was between 2.6% and 6.7%. A significant association was not detected between either PK reduced activity or allele 829A frequency and malaria infection and outcome, although the variant was more frequent among individuals with uncomplicated malaria. Conclusions: This was the first study on the occurrence of PK deficiency in several areas of Africa. A common PKLR mutation G829A (Glu277Lys) was identified. A global geographical co-distribution between malaria and high frequency of PK deficiency seems to occur, suggesting that malaria may be a selective force raising the frequency of this 277Lys variant.
The history of the Jewish Diaspora dates back to the Assyrian and Babylonian conquests in the Levant, followed by complex demographic and migratory trajectories over the ensuing millennia which pose a serious challenge to unraveling population genetic patterns. Here we ask whether phylogenetic analysis, based on highly resolved mitochondrial DNA (mtDNA) phylogenies can discern among maternal ancestries of the Diaspora. Accordingly, 1,142 samples from 14 different non-Ashkenazi Jewish communities were analyzed. A list of complete mtDNA sequences was established for all variants present at high frequency in the communities studied, along with high-resolution genotyping of all samples. Unlike the previously reported pattern observed among Ashkenazi Jews, the numerically major portion of the non-Ashkenazi Jews, currently estimated at 5 million people and comprised of the Moroccan, Iraqi, Iranian and Iberian Exile Jewish communities showed no evidence for a narrow founder effect, which did however characterize the smaller and more remote Belmonte, Indian and the two Caucasus communities. The Indian and Ethiopian Jewish sample sets suggested local female introgression, while mtDNAs in all other communities studied belong to a well-characterized West Eurasian pool of maternal lineages. Absence of sub-Saharan African mtDNA lineages among the North African Jewish communities suggests negligible or low level of admixture with females of the host populations among whom the African haplogroup (Hg) L0-L3 sub-clades variants are common. In contrast, the North African and Iberian Exile Jewish communities show influence of putative Iberian admixture as documented by mtDNA Hg HV0 variants. These findings highlight striking differences in the demographic history of the widespread Jewish Diaspora.
The polymorphism of a new microsatellite locus (CAI) was investigated in a total of 114 Candida albicans strains, including 73 independent clinical isolates, multiple isolates from the same patient, isolates from several episodes of recurrent vulvovaginal infections, and two reference strains. PCR genotyping was performed automatically, using a fluorescence-labeled primer, and in the 73 independent isolates, 26 alleles and 44 different genotypes were identified, resulting in a discriminatory power of 0.97. CAI was revealed to be species specific and showed a low mutation rate, since no amplification product was obtained when testing other pathogenic Candida species and no genotype differences were observed when testing over 300 generations. When applying this microsatellite to the identification of strains isolated from recurrent vulvovaginal infections in eight patients, it was found that 13 out of 15 episodes were due to the same strain. When multiple isolates, obtained from the s...
A tetraplex system for the X chromosome genetic markers DXS7423, DXS101, DXS8377 and HPRTB (human phosphoribosyl transferase) was optimized in a single PCR reaction. These short tandem repeat (STR) markers were typed for 65 individuals (29 female and 36 male samples) from a Galician population sample (Northern Spain). Allele frequencies were estimated for all loci. Optimization of STR multiplexes is a practical and simple method to obtain large amounts of information important in forensic and population genetic studies; therefore, in this context, they should be the tool of choice for X-STR studies.
Haplotype frequencies were determined for 346 individuals from East Timor, and 3 major ethnolinguistic groups were compared, highlighting the need for detailed analyses below the level of the major ethnolinguistic groups. Searching international databases for matches to Timorese haplotypes returns few hits, almost all of them with Philippine data.
HapYDive is the latest version of a software tool devised to evaluate the increase in haplotype diversity (HD) obtained by adding any combination of Y-STRs to a fixed number of markers. Created in Excel format (available at www.ipatimup.pt/app/), it not only calculates Y-STR HD but, more importantly, allows the determination of which combination of Y-STRs is the most informative in a certain population sample. With HapYDive it is possible to analyse any set of Y markers up to a maximum of 20, with a minimum of 4 markers fixed for calculations. Results on the application of this program to different population samples and sizes and with a certain combination of Y-STR markers are presented and discussed, together with its usefulness, mainly to the forensic community.
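The combination search described in this abstract can be sketched in Python. This is only an illustration under stated assumptions, not HapYDive's actual implementation (which is an Excel workbook): it assumes HD is computed with Nei's unbiased estimator, HD = n/(n-1) * (1 - sum of squared haplotype frequencies), and the function names, marker labels (M1 to M4) and toy genotypes are all hypothetical. For brevity the toy example fixes two markers rather than the minimum of four the tool requires.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of hashable haplotypes."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def best_additions(samples, fixed, candidates, k):
    """Rank every k-marker addition to the fixed set by resulting HD.

    samples    : list of dicts mapping marker name -> allele
    fixed      : markers always included
    candidates : markers that may be added
    Returns a list of (HD, added_markers) sorted from most to least diverse.
    """
    ranked = []
    for combo in combinations(candidates, k):
        markers = tuple(fixed) + combo
        haps = [tuple(s[m] for m in markers) for s in samples]
        ranked.append((haplotype_diversity(haps), combo))
    ranked.sort(key=lambda r: r[0], reverse=True)
    return ranked


# Hypothetical toy data: four males typed at four made-up markers.
SAMPLES = [
    {"M1": 14, "M2": 13, "M3": 10, "M4": 11},
    {"M1": 14, "M2": 13, "M3": 10, "M4": 11},
    {"M1": 14, "M2": 13, "M3": 11, "M4": 11},
    {"M1": 15, "M2": 13, "M3": 11, "M4": 11},
]

if __name__ == "__main__":
    for hd, added in best_additions(SAMPLES, ["M1", "M2"], ["M3", "M4"], 1):
        print(added, round(hd, 4))
```

In this toy sample, adding M3 splits one of the haplotypes defined by the fixed markers, whereas M4 is monomorphic and adds nothing, so M3 ranks first. An exhaustive search like this grows combinatorially with the number of candidate markers, which is manageable for the 20-marker ceiling mentioned in the abstract.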
H4) were analyzed in two Native American populations, namely, Tobas (N = 49) and Collas (N = 29), settled in the North and Northwest regions of Argentina, respectively. Standard diversity indices and haplotype frequencies were estimated. Genetic distance between the two populations was estimated by means of the Fst (Rst) test. Statistical tests were performed using Arlequin software Ver 2.000. Thirty-three and fifteen different complete haplotypes were observed for the Tobas and Collas, respectively. Haplotype diversity was 0.9769 ± 0.01 for Tobas and 0.9497 ± 0.02 for Collas. A new variant, present in thirteen individuals, was identified at the DYS385 loci in Tobas. At DYS448, two alleles were found in two samples from the Toba population and in one sample from Collas. No shared haplotypes were found between the two populations. A significant Fst value of 0.1466 was obtained in the pairwise comparison between the two populations (P = 0.00 ± 0.0).
The 2004-2005 GEP proficiency testing programs consisted of a simulated paternity case and a simulated forensic criminal case, each including 3-5 reference samples (saliva or blood) and 2 forensic samples (mixed stains and clean or contaminated hair shafts). In the 2004 forensic test a mixture stain was analysed, and apparently inconsistent results were observed between autosomal STR profiling and mitochondrial DNA sequencing. In 2005, the forensic challenge was an unbalanced mixture stain of saliva and blood from two related contributors (sharing maternal and paternal lineages). Due to the stain characteristics, no lab detected the minor component in the mixture. This shows that the detection of a minor contributor in a mixture is still a key outstanding issue in forensic investigation. Hair shafts contaminated with blood were also sent for analysis, and the results showed the influence of the extraction procedures applied. © 2006 Published by Elsevier B.V.
Recent developments have led to an enormous increase of publicly available large genomic data, in... more Recent developments have led to an enormous increase of publicly available large genomic data, including complete genomes. The 1000 Genomes Project was a major contributor, releasing the results of sequencing a large number of individual genomes, and allowing for a myriad of large scale studies on human genetic variation. However, the tools currently available are insufficient when the goal concerns some analyses of data sets encompassing more than hundreds of base pairs and when considering haplotype sequences of single nucleotide polymorphisms (SNPs). Here, we present a new and potent tool to deal with large data sets allowing the computation of a variety of summary statistics of population genetic data, increasing the speed of data analysis.
Twelve neurological disorders are caused by gene-specific CAG/CTG repeat expansions that are high... more Twelve neurological disorders are caused by gene-specific CAG/CTG repeat expansions that are highly unstable upon transmission to offspring. This intergenerational repeat instability is clinically relevant since disease onset, progression and severity are associated with repeat size. Studies of model organisms revealed the involvement of some DNA replication and repair genes in the process of repeat instability, however, little is known about their role in patients. Here, we used an association study to search for genetic modifiers of (CAG)n instability in 137 parent-child transmissions in Machado-Joseph disease (MJD/SCA3). With the hypothesis that variants in genes involved in DNA replication, repair or recombination might alter the MJD CAG instability patterns, we screened 768 SNPs from 93 genes involved in DNA replication, repair or recombination. We found a variant in ERCC6 (rs2228528) associated with an expansion bias of MJD alleles. When using a gene-gene interaction model, the allele combination G-A (rs4140804-rs2972388) of RPA3-CDK7 is also associated with MJD instability in a direction-dependent manner. Interestingly, the transcription-coupled repair factor ERCC6 (aka CSB), the single-strand binding protein RPA, and the CDK7 kinase part of the TFIIH transcription repair complex, have all been linked to transcriptioncoupled repair. This is the first study performed in patient samples to implicate specific modifiers of CAG instability in humans. In summary, we found variants in three transcription-coupled repair genes associated with the MJD mutation that points to distinct mechanisms of (CAG)n instability. 
Version: Postprint (identical content as published paper) This is a self-archived document from i3S-Instituto de Investigação e Inovação em Saúde in the University of Porto Open Repository For Open Access to more of our publications, please visit https://repositorio-aberto.up.pt/
Ornithine transcarbamylase (OTC, EC 2.1.3.3) deficiency (OTCD; OMIM #311250) is known to be genet... more Ornithine transcarbamylase (OTC, EC 2.1.3.3) deficiency (OTCD; OMIM #311250) is known to be genetically very heterogeneous, with many cases occurring de novo, due to an exceptional instability of the OTC gene. We report a new G > T substitution in the first nucleotide of intron 2 and we describe also a novel SNP (IVS8 + 35 nt: G > T) with very convenient frequencies (62%/38%) for its use as an extra tool for OTCD diagnosis in cases of suspected deletions.
The first population-genetics study for the world newest independent country, East Timor, is pres... more The first population-genetics study for the world newest independent country, East Timor, is presented. In this preliminary work, part of a major ongoing study on East-Timor genetic diversity, the allele frequencies and some statistical parameters of forensic interest were determined for the 15 loci included in AmpFLSTR Identifilerk genotyping kit. A total of 107 samples, collected from East Timorese of several linguistic groups, was typed. All markers are in Hardy-Weinberg equilibrium (except for D2S1338 and D5S818, but the deviations do not reach significance after Bonferroni correction). Observed heterozigosities varied between 72% (D5S818) and 92% (D8S1179).
Although gene-free areas compose the great majority of eukaryotic genomes, a significant fraction... more Although gene-free areas compose the great majority of eukaryotic genomes, a significant fraction of genes overlaps, i.e., unique nucleotide sequences are part of more than one transcription unit. In this work, the evolutionary history and origin of a same-strand gene overlap is dissected through the analysis of COG8 (component of oligomeric Golgi complex 8) and PDF (peptide deformylase). Comparative genomic surveys reveal that the relative locations of these two genes have been changing over the last 445 million years from distinct chromosomal locations in fish to overlapping in rodents and primates, indicating that the overlap between these genes precedes their divergence. The overlap between the two genes was initiated by the gain of a novel splice donor site between the COG8 stop codon and PDF initiation codon. Splicing is accomplished by the use of the PDF acceptor, leading COG8 to share the 3&amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;amp;#39;end with PDF. In primates, loss of the ancestral polyadenylation signal for COG8 makes the overlap between COG8 and PDF mandatory, while in mouse and rat concurrent overlapping and non-overlapping Cog8 transcripts exist. 
Altogether, we demonstrate that the origin, evolution and preservation of the COG8/PDF same-strand overlap follow similar mechanistic steps as those documented for antisense overlaps where gain and/or loss of splice sites and polyadenylation signals seems to drive the process.
The level of molecular heterogeneity associated with α1-antitrypsin gene products was assessed in... more The level of molecular heterogeneity associated with α1-antitrypsin gene products was assessed in the population of northern Portugal using three restriction fragment length polymorphisms (RFLPs) corresponding to specific amino acid substitutions and a highly variable (CA)n repeat polymorphism located at the 5′ end of the PI gene. The allelic affinities inferred from the analysis of the DNA polymorphisms essentially agree with the evolutionary pattern proposed for the PI gene products on the basis of their amino acid sequences. PI*Z can be considered the most recent common PI allele and was found to be associated with the same predominant haplotype previously reported in northern European populations, thus confirming the hypothesis that most European Z alleles are derived from a single mutation. However, a rare deficient variant that is the likely result of a recurrent Z mutation on an M2 or M4 background was additionally observed. PI*S was also found to be associated with a strongly predominant haplotype and seems to be the second most recent PI common allele, while M2 and M3 show weaker associations, suggesting more ancient origins of their corresponding mutations. M1Ala213 and M1Val213 display more homogeneous (CA)n allele frequency distributions, M1Ala213 representing the most ancient PI allele as inferred from its highest variance in (CA)n allele length.
For decades, chromosomal inversions have been regarded as fascinating evolutionary elements as th... more For decades, chromosomal inversions have been regarded as fascinating evolutionary elements as they are expected to suppress recombination between chromosomes with opposite orientations, leading to the accumulation of genetic differences between the two configurations over time. Here, making use of publicly available population genotype data for the largest polymorphic inversion in the human genome (8p23-inv), we assessed whether this inhibitory effect of inversion rearrangements led to significant differences in the recombination landscape of two homologous DNA segments, with opposite orientation. Our analysis revealed that the accumulation of genetic differentiation is positively correlated with the variation in recombination profiles. The observed recombination dissimilarity between inversion types is consistent across all populations analyzed and surpasses the effects of geographic structure, suggesting that both structures (orientations) have been evolving independently over an extended period of time, despite being subjected to the very same demographic history. Aside this mainly independent evolution, we also identified a short segment (350 kb, <10% of the whole inversion) in the central region of the inversion where the genetic divergence between the two structural haplotypes is diminished. Although it is difficult to demonstrate it, this could be due to gene flow (possibly via doublecrossing over events), which is consistent with the higher recombination rates surrounding this segment. This study demonstrates for the first time that chromosomal inversions influence the recombination landscape at a fine-scale and highlights the role of these rearrangements as drivers of genome evolution.
Allele and haplotype frequencies of seven Y-chromosome STR loci were determined from a sample of ... more Allele and haplotype frequencies of seven Y-chromosome STR loci were determined from a sample of 95 and 16 unrelated males from Madeira and Porto Santo Islands, respectively.
The c.156_157insAlu BRCA2 mutation has so far only been reported in hereditary breast/ovarian can... more The c.156_157insAlu BRCA2 mutation has so far only been reported in hereditary breast/ovarian cancer (HBOC) families of Portuguese origin. Since this mutation is not detectable using the commonly used screening methodologies and must be specifically sought, we screened for this rearrangement in a total of 5,440 suspected HBOC families from 22 labs from 13 countries from several continents. Whereas the c.156_157insAlu BRCA2 mutation was detected in 11 of 149 suspected HBOC families from Portugal, representing 37.9% of all deleterious mutations, in other countries it was detected only in one proband living in France and in four individuals requesting predictive testing living in France and in the USA, all having in common the fact that they are relatively recent immigrants of Portuguese origin in those countries. After performing an extensive haplotype study in carrier families, we estimate that this founder mutation has occurred 558±215 years ago. We further demonstrate significant quantitative differences regarding the production of the BRCA2 full length RNA and the transcript with exon 3 skipping in c.156_157insAlu BRCA2 mutation carriers and in controls, indicating that disruption of alternative transcript ratios is the mechanism causing hereditary breast/ovarian cancer associated with this BRCA2 rearrangement. We further show that the cumulative incidence of breast cancer in c.156_157insAlu BRCA2 mutation carriers does not differ from that of other BRCA2 and BRCA1 pathogenic mutations, further strengthening its role as the major contributor to hereditary predisposition to breast cancer in Portugal. We recommend that all suspected HBOC families from Portugal or with Portuguese ancestry are specifically tested for this rearrangement, ideally prior to screening of the entire coding regions of BRCA1 and BRCA2.
European journal of human genetics : EJHG, Jan 5, 2016
Understanding the functional sequelae of amino-acid replacements is of fundamental importance in medical genetics. Perhaps, the most intuitive way to assess the potential pathogenicity of a given human missense variant is by measuring the degree of evolutionary conservation of the substituted amino-acid residue, a feature that generally serves as a good proxy metric for the functional/structural importance of that residue. However, the presence of putatively compensated variants as the wild-type alleles in orthologous proteins of other mammalian species not only challenges this classical view of amino-acid essentiality but also precludes the accurate evaluation of the functional impact of this type of missense variant using currently available bioinformatic prediction tools. Compensated variants constitute at least 4% of all known missense variants causing human-inherited disease and hence represent an important potential source of error in that they are likely to be disproportionat...
Background: Pyruvate kinase (PK) deficiency, which causes hemolytic anemia, has been associated with protection against malaria, and its prevalence in sub-Saharan Africa is not yet known. This work reports the results of a study undertaken to determine the occurrence of PK deficiency in some sub-Saharan African countries and to search for a prevalent PK variant underlying this deficiency. Materials and Methods: Blood samples of individuals from four malaria endemic countries (Mozambique, Angola, Equatorial Guinea and Sao Tome and Principe) were analyzed in order to determine PK deficiency occurrence and detect any possible high-frequency PK variant. The association between this mutation and malaria was assessed through association studies involving sample groups of individuals with different malaria infection and outcome status. Results: The percentage of individuals showing a reduced PK activity in Maputo was 4.1%, and the missense mutation G829A (Glu277Lys) in the PKLR gene (only identified in three individuals worldwide to date) was identified at a high frequency. Heterozygous carrier frequency was between 2.6% and 6.7%. No significant association was detected between either reduced PK activity or allele 829A frequency and malaria infection and outcome, although the variant was more frequent among individuals with uncomplicated malaria. Conclusions: This was the first study on the occurrence of PK deficiency in several areas of Africa. A common PKLR mutation G829A (Glu277Lys) was identified. A global geographical co-distribution between malaria and high frequency of PK deficiency seems to occur, suggesting that malaria may be a selective force raising the frequency of this 277Lys variant.
The history of the Jewish Diaspora dates back to the Assyrian and Babylonian conquests in the Levant, followed by complex demographic and migratory trajectories over the ensuing millennia which pose a serious challenge to unraveling population genetic patterns. Here we ask whether phylogenetic analysis, based on highly resolved mitochondrial DNA (mtDNA) phylogenies, can discern among maternal ancestries of the Diaspora. Accordingly, 1,142 samples from 14 different non-Ashkenazi Jewish communities were analyzed. A list of complete mtDNA sequences was established for all variants present at high frequency in the communities studied, along with high-resolution genotyping of all samples. Unlike the previously reported pattern observed among Ashkenazi Jews, the numerically major portion of the non-Ashkenazi Jews, currently estimated at 5 million people and comprised of the Moroccan, Iraqi, Iranian and Iberian Exile Jewish communities, showed no evidence for a narrow founder effect, which did however characterize the smaller and more remote Belmonte, Indian and the two Caucasus communities. The Indian and Ethiopian Jewish sample sets suggested local female introgression, while mtDNAs in all other communities studied belong to a well-characterized West Eurasian pool of maternal lineages. Absence of sub-Saharan African mtDNA lineages among the North African Jewish communities suggests negligible or low level of admixture with females of the host populations, among whom the African haplogroup (Hg) L0-L3 sub-clade variants are common. In contrast, the North African and Iberian Exile Jewish communities show influence of putative Iberian admixture as documented by mtDNA Hg HV0 variants. These findings highlight striking differences in the demographic history of the widespread Jewish Diaspora.
The polymorphism of a new microsatellite locus (CAI) was investigated in a total of 114 Candida albicans strains, including 73 independent clinical isolates, multiple isolates from the same patient, isolates from several episodes of recurrent vulvovaginal infections, and two reference strains. PCR genotyping was performed automatically, using a fluorescence-labeled primer, and in the 73 independent isolates, 26 alleles and 44 different genotypes were identified, resulting in a discriminatory power of 0.97. CAI was revealed to be species specific and showed a low mutation rate, since no amplification product was obtained when testing other pathogenic Candida species and no genotype differences were observed when testing over 300 generations. When applying this microsatellite to the identification of strains isolated from recurrent vulvovaginal infections in eight patients, it was found that 13 out of 15 episodes were due to the same strain. When multiple isolates, obtained from the s...
A tetraplex system for the X chromosome genetic markers DXS7423, DXS101, DXS8377 and HPRTB (human phosphoribosyl transferase) was optimized in a single PCR reaction. These short tandem repeat (STR) markers were typed for 65 individuals (29 female and 36 male samples) from a Galician population sample (Northern Spain). Allele frequencies were estimated for all loci. Optimization of STR multiplexes is a practical and simple method to obtain large amounts of information important in forensic and population genetic studies; in this context, multiplexes should be the approach of choice for X-STR studies.
Haplotype frequencies were determined for 346 individuals from East Timor and 3 major ethnolinguistic groups were compared, highlighting the need for detailed analyses below the level of the major ethnolinguistic groups. Searches for matches to Timorese haplotypes in international databases returned few hits, almost all with Philippine data.
HapYDive is the latest version of a software tool devised to evaluate the increase in haplotype diversity (HD) obtained by adding any combination of Y-STRs to a fixed number of markers. Created in Excel format (available at www.ipatimup.pt/app/), it not only calculates Y-STR HD but, more importantly, determines which combination of Y-STRs is the most informative in a given population sample. With HapYDive it is possible to analyse any set of Y markers up to a maximum of 20, with a minimum of 4 markers fixed for calculations. Results of applying this program to population samples of different origins and sizes, and with particular combinations of Y-STR markers, are presented and discussed, together with its usefulness, mainly to the forensic community.
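The kind of calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses Nei's unbiased haplotype diversity estimator, HD = n/(n-1) × (1 − Σ p_i²), which is widely used for Y-STR data but is not confirmed by the abstract as the formula HapYDive implements. The function names, the data layout and the toy alleles below are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2))."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def rank_additions(samples, fixed, candidates, k):
    """Rank every k-marker addition to the fixed set by the resulting HD.

    samples: list of dicts mapping marker name -> allele (hypothetical layout).
    Returns a list of (hd, added_markers) sorted from highest to lowest HD.
    """
    results = []
    for extra in combinations(candidates, k):
        markers = tuple(fixed) + extra
        haps = [tuple(s[m] for m in markers) for s in samples]
        results.append((haplotype_diversity(haps), extra))
    results.sort(key=lambda r: r[0], reverse=True)
    return results


# Toy data, invented purely for illustration (not from any study).
samples = [
    {"DYS19": 14, "DYS389I": 13, "DYS390": 24, "DYS391": 11, "DYS392": 13, "DYS393": 13},
    {"DYS19": 14, "DYS389I": 13, "DYS390": 24, "DYS391": 11, "DYS392": 13, "DYS393": 12},
    {"DYS19": 14, "DYS389I": 13, "DYS390": 24, "DYS391": 11, "DYS392": 13, "DYS393": 13},
    {"DYS19": 15, "DYS389I": 13, "DYS390": 24, "DYS391": 11, "DYS392": 11, "DYS393": 13},
]
fixed = ["DYS19", "DYS389I", "DYS390", "DYS391"]
candidates = ["DYS392", "DYS393"]

baseline = haplotype_diversity([tuple(s[m] for m in fixed) for s in samples])
ranked = rank_additions(samples, fixed, candidates, k=1)
```

In this toy set, adding DYS393 splits the shared four-marker haplotype while DYS392 does not, so DYS393 ranks first. An exhaustive search over combinations grows quickly with the number of candidates, which may be one reason a tool would cap the marker count.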
H4) were analyzed in two Native American populations, namely, Tobas (N = 49) and Collas (N = 29), settled in the North and Northwest regions of Argentina, respectively. Standard diversity indices and haplotype frequencies were estimated. Genetic distance between the two populations was estimated by means of the Fst (Rst) test. Statistical tests were performed using Arlequin software Ver 2.000. Thirty-three and fifteen different complete haplotypes were observed for the Tobas and Collas, respectively. Haplotype diversity was 0.9769 ± 0.01 for Tobas and 0.9497 ± 0.02 for Collas. A new variant, present in thirteen individuals, was identified at DYS385 loci in Tobas. At DYS448, two alleles were found in two samples from the Toba population and in one sample from Collas. No shared haplotypes were found between the two populations. A significant Fst value of 0.1466 was obtained in the pairwise comparison between the two populations (P = 0.00 ± 0.0).
The 2004-2005 GEP proficiency testing programs consisted of a simulated paternity case and a simulated forensic criminal case, each including 3-5 reference samples (saliva or blood) and 2 forensic samples (mixed stains and clean or contaminated hair shafts). In the 2004 forensic test a mixture stain was analysed and apparently inconsistent results were observed between autosomal STR profiling and mitochondrial DNA sequencing results. In 2005, the forensic challenge was an unbalanced mixture stain of saliva and blood from two related contributors (sharing maternal and paternal lineages). Due to the stain characteristics, no lab detected the minor component in the mixture. This shows that the detection of a minor contributor in a mixture is still a key outstanding issue in forensic investigation. Hair shafts contaminated with blood were also sent for analysis, and the results showed the influence of the extraction procedures applied. © 2006 Published by Elsevier B.V.
Due to differences in transmission between X-chromosomal and autosomal DNA, the comparison of data derived from both markers allows deeper insight into the forces that shape the patterns of genetic diversity in populations. In this study, we applied this comparative approach to a sample of Portuguese Roma (Gypsies) by analyzing 43 X-chromosomal markers and 53 autosomal markers. Portuguese individuals of non-Gypsy ancestry were also studied. Compared with the host population, reduced levels of diversity on the X chromosome and autosomes were detected in Gypsies; this result was in line with known patterns of genetic diversity typical of Roma groups. As a consequence of the complex demographic past of the Roma, during which admixture and genetic drift played major roles, the amount of linkage disequilibrium (LD) on the X chromosome in Gypsies was considerably higher than that observed in non-Gypsies. When the pattern of differentiation on the X chromosome was compared with that of autosomes, there was evidence for asymmetries in female and male effective population sizes during the admixture between Roma and non-Roma. This result supplements previous data provided by mtDNA and the Y chromosome, underlining the importance of using combined information from the X chromosome and autosomes to dissect patterns of genetic diversity. Following the out-of-India dispersion, the Roma acquired a complex genetic pattern that was influenced by drift and introgression with surrounding populations, with important contributions from both males and females. We provide evidence that a sex-biased admixture with Europeans is probably associated with the founding of the Portuguese Gypsies. Copyright © 2012 Wiley Periodicals, Inc.
A genetic method to identify the breed of origin could serve as a useful tool for inspecting the authenticity of the increasing number of monobreed foodstuffs, such as those derived from small local European pig breeds. Mitochondrial DNA (mtDNA) is practically the only reliable genomic target for PCR in processed products, and its haploid nature and strict maternal inheritance greatly facilitate genetic analysis. As a result of strategies that sought to improve the production traits of European pigs, most industrial breeds presently show a high frequency of Asian alleles, while the absence or low frequency of such Asian alleles has been observed in small rustic breeds from which highly prized dry-cured and other traditional products are derived. Therefore, the detection of Asian ancestry would indicate nonconformity in Protected Denomination of Origin products. This study presents a single base extension assay based on 15 diagnostic mtDNA single nucleotide polymorphisms to discriminate between Asian and European Sus scrofa lineages. The test was robust, sensitive and accurate in a wide range of processed foodstuffs and allowed accurate detection of pig genetic material and identification of maternal ancestry. A market survey suggested that nonconformity of products derived from Portuguese breeds is an unusual event at present, but regular surveys both in the local populations and in commercial products would be advisable. Taking into consideration the limitations presented by other methodologies, this mtDNA-based test probably attains the highest resolution for the direct genetic test for population of origin in Sus scrofa food products. © 2011 American Chemical Society.
Papers by Antonio Amorim