Introduction
On July 17, 2014, a Malaysian airplane crashed near Hrabove, Ukraine. There were 15 crew members and 283 passengers on board this deadly flight. At the time, the plane was flying over a war zone, so the suspicion is that it was hit by a missile as part of a terrorist attack. Researchers must investigate the cause of the crash of flight MH17.
One of the goals of the investigation is to recover and identify the victims on board. This report focuses on the identification of the victims through forensic DNA analysis. It describes the steps that need to be taken before DNA profiling can begin and various DNA analysis techniques. Lastly, there is a transcript of an interview with a forensic DNA expert.
For the identification of humans in mass disasters, DNA analysis is the gold standard, especially when a victim cannot be identified using physical characteristics (e.g. birthmarks, tattoos, medical implants, clothing and jewelry), forensic anthropology, fingerprints, odontology or radiology. Forensic DNA typing can be used for identification whenever sufficient DNA data can be collected from a biological sample or body part. DNA can sometimes be recovered even when a victim’s remains are fragmented and the DNA is degraded. DNA analysis is also often used to re-associate severely fragmented remains with the victims, because it is the only technique that allows this. A drawback is that DNA analysis requires more time, effort and specialized, skilled personnel than the other methods [1].
Reference samples
In order to make a positive identification, reference samples from possible victims are collected so that the DNA profiles can be compared to each other. Reference samples can be collected from several different sources. They can be collected from personal belongings that were frequently used by the victim (e.g. toothbrush, hairbrush, razor, unwashed undergarments). Banked samples can also be used, such as banked sperm or archival biopsy tissue stored in a medical facility. Other biological samples that can be used are blood stain cards, blood stored for elective surgery, pathology samples, and extracted adult or baby teeth. Another source of reference samples is the biological relatives of the victim, from whom samples are collected using buccal swabs [1].
The reference samples can be categorized by the quality of the DNA available on them: good, fair and poor sources of DNA (see table 1) [2].
Table 1: Categorization of reference samples into good, fair and poor sources of DNA [2]
| Good sources of DNA | Fair sources of DNA | Poor sources of DNA |
| Toothbrushes | Lipstick | Jewelry |
| Razors | Deodorant sticks | Wrist watches |
| Hair brushes | Pillow cases | Outer clothing |
| Bone marrow samples | Used cups/glasses | Towels |
| Blood cards from PKU screening | Used underwear | Shoes |
| National bio banks | Fingernail clippings | Hair bands |
| Criminal databases | Cigarette butts | Baby hair |
| Serum samples | Pipe | Dentures |
| Sperm bank samples | Mouth piece/guard | Hair rollers |
| Dried umbilical cords | Helmets/caps/hats | Trimmers |
| Pathology specimens | Earplugs | |
| | Eye glasses | |
| | Inner clothing items (bra, socks, t-shirts) | |
| | Pen with teeth marks | |
| | Mailed envelopes/postcards | |
There are several preferred combinations of family reference samples: both parents; one parent, spouse and children; children and spouse; one parent and a sibling; two or more siblings; a known identical twin. For the possible combinations in kinship analysis, the probability of identity has been calculated. The three lowest-scoring combinations are one full sibling (92.1%), sibling and aunt or uncle (94.4%) and three grandparents (96.7%). The three highest-scoring combinations are sibling and parent (99.996%), father and two maternal half siblings (99.996%) and three grandparents and sibling (99.994%). The full list is shown in table 2 [3].
Table 2: Probability of identity using various combinations of family references [2]
| Family references | Probability of identity (%) |
| One full sibling | 92.1 |
| Sibling and aunt (or uncle) | 94.4 |
| Sibling and two aunts (or uncles) from same side of the family | 97.8 |
| Sibling, aunt, uncle from different sides of the family | 99.8 |
| Sibling and half sibling | 98 |
| Sibling and two half siblings (same mother) | 99.4 |
| Two siblings | 99.91 |
| One parent | 99.9 |
| Sibling and parent | 99.996 |
| Father and one maternal half sibling | 99.95 |
| Father and two maternal half siblings | 99.996 |
| Father and maternal aunt | 99.993 |
| Three grandparents | 96.7 |
| Four grandparents | 99.99 |
| Three grandparents and sibling | 99.994 |
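The ranking of the family reference combinations can be checked with a short script. The sketch below simply transcribes the values of table 2 into a dictionary and sorts them; the names FAMILY_REFERENCES and rank_combinations are chosen here for illustration and do not come from the cited source.

```python
# Probability of identity (%) per combination of family references,
# transcribed from table 2 of this report.
FAMILY_REFERENCES = {
    "One full sibling": 92.1,
    "Sibling and aunt (or uncle)": 94.4,
    "Sibling and two aunts (or uncles) from same side of the family": 97.8,
    "Sibling, aunt, uncle from different sides of the family": 99.8,
    "Sibling and half sibling": 98.0,
    "Sibling and two half siblings (same mother)": 99.4,
    "Two siblings": 99.91,
    "One parent": 99.9,
    "Sibling and parent": 99.996,
    "Father and one maternal half sibling": 99.95,
    "Father and two maternal half siblings": 99.996,
    "Father and maternal aunt": 99.993,
    "Three grandparents": 96.7,
    "Four grandparents": 99.99,
    "Three grandparents and sibling": 99.994,
}


def rank_combinations(table):
    """Return (combination, probability) pairs sorted from lowest to highest."""
    return sorted(table.items(), key=lambda item: item[1])


ranked = rank_combinations(FAMILY_REFERENCES)
lowest_three = ranked[:3]    # least informative combinations
highest_three = ranked[-3:]  # most informative combinations
```

Sorting the table this way makes it easy to see which relatives should be approached first when several family members are available.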
Fragmented remains that have already been identified using DNA can also be used as a reference for further identifications [1].
Limitations
There are several limitations to the retrieval of remains and reference samples. One of these is an environmentally harsh condition at the incident site, which can severely limit the quantity of usable DNA from human remains. Airplane passengers also carry their personal belongings with them in their luggage. These personal items may therefore be destroyed in the plane crash and cannot be used as reference samples. In some cases, using DNA from a close relative can also be difficult, because families sometimes travel together and are all victims of the crash. Some relatives also choose not to give a DNA sample or are unable to. Finally, it is important to establish that a collected personal item was used only by the victim, in order to avoid a mixed profile or a wrong identification [1].
Parameters of DNA identification
The importance of DNA in the identification process depends on the degree to which the human remains are fragmented or degraded. Sometimes multiple methods are used to identify a victim, and these do not always include DNA. In an airplane crash, only about 25% of victims are identified using DNA analysis as the sole method of identification. The parameters are set by policy makers, and it is the job of the laboratory director to determine the nature and extent of the laboratory’s response [1].
Defining goal of identification process
Depending on the scope of the identification process, a decision is made on whether every human fragment is going to be identified. In an airplane crash, for instance, there can be 50 victims, and each victim’s remains can be fragmented into multiple pieces. If the goal or policy is to identify each victim, the identification process is finished sooner, because DNA analysis stops once all 50 victims have been identified. Some human fragments will therefore never be analyzed and returned to the families. If the goal or policy is to attempt identification of all fragments, this results in a greater workload for the laboratory. The scope of the investigation is determined by how large and devastating the mass disaster is to a community, a country or the world [1].
Minimum fragment size
Depending on the goal or policy of the identification process, a minimum fragment size for DNA testing needs to be established. If all fragments were collected and tested, many of them would not give a positive result. It would also make the identification process longer and very costly. Although the families would receive more of the remains of their loved ones, they may well be unprepared for the condition of the fragments and the time it takes to identify them. With all this in mind, decisions are made regarding the minimum fragment size and the statistical threshold that must be met. The minimum fragment size, which is usually one to ten cm, should be based on three criteria. The first criterion is maximizing the probability that all victims are identified. The second is recognizing the emotional needs of the victims’ families and friends. The third is providing forensically relevant information. Defining the fragment size is important because it affects the identification effort: it determines which fragments need to be collected at the crash site, how they will be processed and the likelihood of obtaining a usable DNA profile [1].
DNA technology
The laboratory makes a preliminary decision on the DNA technology to be used. If the recovery effort is long, the DNA in the fragments starts to degrade, which can affect the choice of DNA analysis method. Depending on the environmental conditions at the crash site and the resulting DNA degradation, the laboratory can decide to use a method other than the standard choices. The decision can also depend on the scope and duration set for the DNA effort [1].
STR (Short Tandem Repeat) analysis has proven to be a powerful method for DNA identification in mass fatality incidents. There are three known airplane crashes in which the victims were identified using STR analysis alone. Remains from the WTC attacks demonstrated that STRs work on degraded tissue and bone fragments if DNA extraction is optimized. However, if a sample is severely compromised, additional methods may be needed to reach the statistical threshold. Other methods that can be used are mtDNA (mitochondrial DNA) and SNP (Single Nucleotide Polymorphism) analysis [1].
Length of recovery effort
The length of the recovery effort depends on the location of the site and the scale of the mass fatality, and it influences the DNA identification of the victims. When remains are collected from an airplane crash on land, it usually takes up to two weeks for all of them to be collected. In a mass fatality, the identification process begins as samples are collected and delivered to the laboratory. It would be more effective and efficient to wait until all samples have been collected, but pressure from the public and the families for rapid confirmation makes this impossible. The collection of reference samples also affects the length of the identification process. After a mass fatality, personal belongings and biological samples are sent in batches, so the number of batches and the frequency with which they are sent to the laboratory depend on the efficiency and duration of the reference collection process [1].
Laboratory workload
After considering all of the above, the laboratory must determine its analytical processes. Several key variables have to be assessed to do this: the number of victims, the number of recoverable fragments, the percentage of samples to be reworked, the number of personal items per victim, the percentage of personal items to be reworked, the number of quality control samples for personal items and the number of kinship analyses. A DNA analysis workload worksheet can be used to predict the labor and material resources required for DNA analysis (see appendix 1) [1].
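To illustrate how these key variables combine into a workload estimate, the sketch below adds them up in the simplest possible way. It is not the worksheet from appendix 1: the class name, the field names, the formula and all example numbers are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class WorkloadInputs:
    """Key variables named in the text; the field names are hypothetical."""
    victims: int
    recoverable_fragments: int
    fragment_rework_fraction: float       # e.g. 0.2 means 20% of fragments are reworked
    personal_items_per_victim: int
    personal_item_rework_fraction: float
    quality_control_samples: int
    kinship_analyses: int


def estimate_analyses(w: WorkloadInputs) -> int:
    """Rough total number of analyses (assumed formula: each rework is one extra run)."""
    fragment_runs = w.recoverable_fragments * (1 + w.fragment_rework_fraction)
    personal_items = w.victims * w.personal_items_per_victim
    personal_runs = personal_items * (1 + w.personal_item_rework_fraction)
    total = (fragment_runs + personal_runs
             + w.quality_control_samples + w.kinship_analyses)
    return round(total)


# Hypothetical incident, not data from any real investigation.
example = WorkloadInputs(
    victims=50,
    recoverable_fragments=500,
    fragment_rework_fraction=0.2,
    personal_items_per_victim=3,
    personal_item_rework_fraction=0.1,
    quality_control_samples=20,
    kinship_analyses=100,
)
total_analyses = estimate_analyses(example)  # 600 + 165 + 20 + 100 = 885
```

A real worksheet would also translate the number of analyses into staff hours and consumables, which this sketch leaves out.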
In a mass fatality event, the DNA identification response demands forensic casework skills and high-throughput genotyping or databasing capacity from the public and/or private sector. STR genotyping for forensic identification differs from analysis for medical or research purposes. Laboratories that perform high-quality clinical or research genotyping should therefore be brought in only after careful consideration. DNA collection from human remains and personal items requires chain-of-custody protocols that clinical or research laboratories do not typically use, and DNA extraction needs to be performed using forensic casework extraction protocols. Kinship samples, on the other hand, are more amenable to the standardized high-throughput processes used by forensic databasing laboratories and non-forensic genotyping laboratories. In most cases, forensic databasing laboratories have more experience with outsourcing work to private laboratories than forensic casework laboratories do. If work is outsourced to other laboratories, it is important to ensure that all laboratories involved use the same molecular ladders as size standards for allelic interpretation. They must also all use the same DNA analysis protocols, so that victims’ profiles can be evaluated in a standardized way. Working practices can vary between forensic and non-forensic laboratories. It is the duty of the laboratory director to fully define terms such as ‘acceptable positive and negative controls’ and ‘standard reaction volume’ [1].
Besides DNA analysis, the laboratory might also be responsible for sample accessioning and tracking, making identifications and resolving metadata problems, quality control, interacting with families and the media, and long-term sample storage. Failure to address these activities will result in resource shortfalls. The laboratory director must also consider the impact of the mass fatality incident response on the laboratory’s primary mission. Backlog and turnaround times for regular casework will most likely increase. That is why plans to manage both a mass fatality and regular casework should always be developed in advance. The duration of the recovery effort also affects the capacity of the laboratory. A rapid recovery effort, which usually takes one to three months, creates a spike in casework, but because the spike is short the laboratory is able to recover quickly. A longer recovery effort is possible without affecting regular casework; however, the identification process will still drain personnel and resources. Good planning is critical in order to mitigate the disruption [1].
Sample tracking
Sample tracking is an important factor in ensuring quality and accuracy throughout the DNA identification process. Chain of custody and sample origin are therefore very important when handling samples. They are critical to the management of the identification process and to the collection of reference samples for comparison. If samples are not properly coded and tracked, this has consequences for the identification process. Even ‘simple’ problems, such as inadvertent switching of reference samples by the victims’ families, misspelled names, nicknames that are not linked to last names, or inconsistent case numbering during sample collection at the crash site, can greatly influence the efficiency and accuracy of the identification process [1].
Public forensic institutes usually have a chain-of-custody system in place already and will use it with only slight changes in a mass fatality incident. A reference sample brought in by a family member of a victim may well give a mixed DNA profile, so this possibility should be noted when the sample is documented and kept in mind when interpreting the result. Before any of these decisions are made, however, it must be decided how the mass fatality incident is going to be treated: as a humanitarian effort, a civil incident or a criminal matter. Once the type of treatment is established, the chain-of-custody requirements can be decided. The implications of each type of treatment are shown in figure 1 [1].
Figure 1: An overview of the possible ways to treat the incident and their implications [1]
Laboratories will already have an information management system in place. In a mass fatality they can keep using that system in order to maintain efficient sample tracking. If the director of the laboratory chooses to modify the system, he or she should consider tracking the mass fatality samples separately from the regular casework samples. The coding of these samples should start with a different number sequence than the one regularly used in the laboratory. The laboratory should also consider having a team whose sole purpose is to enter the collected data and check that all samples are correctly coded. The laboratory director must also establish sample naming that differentiates personal items, kinship samples and disaster samples from each other. The laboratory also needs to document the number and type of analyses performed on every sample tested. Table 3 gives an overview of the information that is generally recorded for test samples [1].
Table 3: Coding that should be present on test samples [1]
| Victim samples | Personal effects samples | Kinship samples |
| Identity of laboratory that performed extraction and analysis* | Victim identification number | Victim identification number |
| Extraction attempt number | Identity of laboratory that performed extraction and analysis* | Relationship to victim |
| Type of analysis | Extraction attempt number | |
| Plate number, tube number, well number, etc. | Type of analysis | |
| | Plate number, tube number, well number, etc. | |
*These are usually noted in a multi-laboratory response
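The coding requirements of table 3 can be pictured as a small record structure with a completeness check. The sketch below is one possible way to do this, following the column assignment as read from table 3; the class name, the field names, the "MF" prefix and the V/P/K letters are hypothetical and not taken from the cited source.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical letter codes that keep the three sample types apart.
PREFIXES = {"victim": "V", "personal_effect": "P", "kinship": "K"}

# Fields expected per sample type, following table 3. The laboratory
# identity is treated as required here, although the table notes that it is
# usually only recorded in a multi-laboratory response.
REQUIRED = {
    "victim": ["laboratory", "extraction_attempt", "analysis_type", "location"],
    "personal_effect": ["victim_id", "laboratory", "extraction_attempt",
                        "analysis_type", "location"],
    "kinship": ["victim_id", "relationship_to_victim"],
}


@dataclass
class SampleRecord:
    sample_type: str                       # "victim", "personal_effect" or "kinship"
    sequence_number: int
    victim_id: Optional[str] = None
    relationship_to_victim: Optional[str] = None
    laboratory: Optional[str] = None
    extraction_attempt: Optional[int] = None
    analysis_type: Optional[str] = None
    location: Optional[str] = None         # plate, tube or well number

    def sample_code(self) -> str:
        """Code with a separate prefix so it cannot clash with regular casework."""
        return f"MF-{PREFIXES[self.sample_type]}-{self.sequence_number:05d}"


def missing_fields(record: SampleRecord) -> list:
    """List the fields of table 3 that have not been filled in yet."""
    return [name for name in REQUIRED[record.sample_type]
            if getattr(record, name) is None]


kin = SampleRecord("kinship", 12, victim_id="victim-001")
```

A data entry team could run a check like missing_fields on every record before a sample is released for analysis, which catches incomplete coding early.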
Sample collection
The recovery and preservation of samples are very important steps in the identification of human remains, especially when the remains are highly fragmented. Samples should be collected as soon as possible, but depending on the crash site and the environment, collection sometimes takes much longer. Proper preservation is also important so that the DNA stays intact and a successful DNA profile can be obtained [2]. Depending on the state of preservation, different tissue types should be collected. When sample collection cannot be done immediately and takes place several weeks after the incident, and/or the crash site has challenging environmental conditions, bone and teeth have proved to be the most reliable sources for a successful DNA profile. For bone samples, cortical bone from the weight-bearing long bones of the legs should be the first choice for DNA extraction. With degraded remains it is nevertheless recommended to take a swab in addition to the bone sample. The type of sample that should be collected depends on the condition of the body. Figure 2 gives an overview of the post-mortem samples to collect depending on the condition of the body [2].
Commingling of DNA is a possibility, and it can lead to false DNA-based associations. This is especially true for fragmented remains. Blood from different fragments that contaminates other fragments, or animal activity, is often the source of commingled DNA. The analyst therefore always has to be on the lookout for cross-contamination. When collecting samples in the field, workers should never assume that fragments belong together or to the same victim based on their appearance. It is also advisable to have an anthropologist or a trained forensic pathologist in the field to make sure that animal remains are not collected along with human tissue.
Sample storage
The number of samples collected during a mass fatality may well be too large for one laboratory to store. In that case, the laboratory handling the analysis can ask other trustworthy laboratories to store part of the samples. To keep decomposition to a minimum, samples need to be stored at low temperatures of around -20 °C. Dried stains should also be stored frozen, but if this is not possible, room temperature in a low-humidity environment will also work. Skeletal remains that are going to be stored should also be kept at room temperature. Reference samples are precious and can be limited, so proper storage of reference samples is also very important [2].
DNA analysis
To perform a successful DNA analysis, the laboratory needs sufficient additional processing area. Because of the mass fatality, the laboratory will be dealing with a large number of samples and might need to consider using a robotics system, which can minimize human error and contamination. In mass disasters, contamination and mixtures of samples are unavoidable. For this reason, the DNA typing method used to generate profiles needs to be well established. STR and mtDNA typing are the methods most often used for identification purposes, but depending on the state of the samples, other typing methods can also be used [3].
DNA extraction
Before DNA analysis can be performed, the DNA first needs to be extracted from the reference and victim samples. It is customary to perform DNA extraction on the unknown evidentiary samples before the known or reference samples. With samples from a mass disaster it may not be possible to extract in that order, but the extraction of victim samples and reference samples should at least be separated in time and/or space. DNA typing relies on an extraction that yields DNA of sufficient quantity, quality and purity. The most desirable extraction methods are therefore those that minimize DNA loss and that overcome, remove or dilute enzymatic inhibitors [3].
The DNA IQ™ System (Promega Corp, Madison, WI)
Current techniques that use organic solvents and ion exchange resins are time consuming, require more than one centrifugation step, use toxic organic solutions and/or do not remove PCR (Polymerase Chain Reaction) inhibitors well enough. A system that was used for the extraction of WTC samples proved to be the best. The DNA IQ™ System (Promega Corp, Madison, WI) uses a magnetic resin to which the DNA binds. It also contains a denaturing agent that disrupts many types of cells and tissue. The DNA is then purified by eluting it from the resin, after which DNA typing can be performed without further preparation. See figure 3 for a schematic overview of the extraction. The system minimizes the loss of DNA and increases the efficiency of DNA extraction [3].
Organic extraction
As mentioned above, organic extraction is also widely used in the forensic community for DNA typing. As the name indicates, organic extraction uses organic chemicals to extract DNA. It takes place in four steps. In the first step, EDTA is added to the lysis buffer to prevent degradation of the DNA. Tris, which is also present in the buffer, makes the outer cell membrane permeable. In the second step, the cells are lysed and the proteins are denatured and hydrolyzed by the detergents, Proteinase K and dithiothreitol (DTT) present in the lysis buffer. These lyse the cell membrane, separate and denature the histone proteins and destroy protein structures [5]. In the third step, the proteins and cell debris are removed by adding phenol-chloroform-isoamyl alcohol to the mixture. Phenol does not mix with water, and the proteins and debris have an affinity for the organic phase, so they separate into the organic phase while the DNA remains in the aqueous phase. After centrifugation the mixture has visible layers, with the DNA in the upper layer, the aqueous phase. In the fourth step, this phase is transferred to a new tube and the DNA is purified by alcohol precipitation [5]. Figure 4 shows a schematic overview of the organic DNA extraction.
Chelex extraction
Another popular extraction method in the forensic community is Chelex. Chelex comes in different grades, with Chelex 100 being preferred for forensic use. The resin consists of styrene divinylbenzene copolymers containing paired iminodiacetate ions, to which polyvalent metal ions can bind. Under aqueous alkaline conditions, Chelex has an increased affinity for heavy metals. When the mixture is boiled, the cell membranes and cell proteins are disrupted and the DNA is denatured. After centrifugation, the resin and cellular debris are separated from the DNA-containing supernatant. The advantages of Chelex are that it saves time, costs little and minimizes the potential for contamination, because it has fewer steps than organic extraction. Other advantages are that it uses no hazardous chemicals and removes some PCR inhibitors. Disadvantages include potential degradation of the DNA during long-term storage, residual resin that can inhibit PCR, and the fact that it does not work with all samples [5]. A schematic overview of the Chelex extraction is shown in figure 5.
Polymerase Chain Reaction (PCR)
After extraction is completed, the DNA goes through PCR. PCR is an enzymatic process in which a specific DNA fragment is replicated over and over again. The process works by heating and cooling the extracted DNA samples in a precise thermal cycling pattern for approximately 30 cycles [6]. Each cycle consists of three steps [1]. In the first step, the DNA template is denatured by heat. In the second step, the temperature is lowered so that the primers can anneal (bind) to the DNA. In the last step, the temperature is raised again to the optimal temperature for the DNA polymerase, which then extends the primers along the template strands (see figure 6) [6]. When the cycle is repeated multiple times, the desired DNA fragment, if present, accumulates exponentially. Even from a small amount of isolated DNA, PCR can generate a large amount of DNA [1].
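The exponential accumulation can be made concrete with a small calculation. The sketch below assumes the idealized model in which every cycle multiplies the number of copies by (1 + efficiency), so that a perfect reaction doubles the product each cycle. The function name pcr_copies is chosen for illustration, and the model ignores the plateau that occurs in real reactions.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of copies after a number of PCR cycles.

    efficiency = 1.0 means perfect doubling in every cycle;
    efficiency = 0.9 means each cycle copies 90% of the templates.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# One template molecule after 30 perfect cycles: 2**30 copies.
copies_after_30 = pcr_copies(1, 30)
```

Under these assumptions a single template molecule yields 2^30, roughly one billion, copies after 30 cycles, which shows why a small amount of isolated DNA is enough.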
The primer, mentioned above, is the most important PCR component. A primer is a short DNA sequence that binds just before the DNA fragment that needs to be copied, and thus serves to identify that fragment. Primers are added in high concentration relative to the DNA template in order to drive the PCR reaction. The other components are, naturally, the template DNA that is going to be copied, the four nucleotide building blocks called dNTPs and a DNA polymerase that puts the building blocks in the correct order along the template DNA. The DNA polymerase used is Taq polymerase, which comes from Thermus aquaticus, a bacterium that inhabits hot springs. When setting up multiple samples that use the same primers, a master mix is made, from which an equal volume is pipetted into each PCR tube. Making a master mix ensures homogeneity between samples and avoids the pipetting of small volumes [6].
Multiplex PCR
In this version of PCR, more than one DNA fragment is copied simultaneously. This is achieved by adding more than one primer pair to the mixture. For this to work properly, the primer pairs should have similar annealing temperatures. Each extra primer pair increases the complexity of the possible primer interactions. Excessive regions of complementarity have to be avoided in order to prevent primer-dimer formation, in which primers bind to each other instead of to the template DNA. The possibility of primer-dimer formation increases with every new primer pair that is added [6].
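Both requirements, similar annealing temperatures and little complementarity between primers, can be screened with very simple checks. The sketch below uses the Wallace rule of thumb (2 °C per A or T and 4 °C per G or C) as a rough melting temperature, and only tests whether the last bases at the 3′ ends of two primers are exactly complementary. Real primer design software uses far more refined models. The primer sequences and the four-base window are made up for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def wallace_tm(primer: str) -> int:
    """Rule-of-thumb melting temperature in degrees C for short primers."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def three_prime_dimer_risk(primer_a: str, primer_b: str, window: int = 4) -> bool:
    """True if the last `window` bases of both primers can pair with each other."""
    return primer_a[-window:] == reverse_complement(primer_b[-window:])


def tm_spread(primers) -> int:
    """Difference between the highest and lowest estimated melting temperature."""
    temperatures = [wallace_tm(p) for p in primers]
    return max(temperatures) - min(temperatures)


# Hypothetical primers: the 3' ends of the first two are complementary.
primer_1 = "GATTACAGATTACAACGG"
primer_2 = "TTGACCATGAGTCACCGT"
primer_3 = "TTGACCATGAGTCAAAAA"
risky = three_prime_dimer_risk(primer_1, primer_2)
safe = not three_prime_dimer_risk(primer_1, primer_3)
```

Because every primer has to be compared with every other primer, the number of pairwise checks grows quickly with each added primer pair, which matches the increasing risk described above.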
DNA quantification
After DNA extraction, DNA quantification is essential for polymerase chain reaction (PCR) based analysis. This is important because multiplex PCR works best within a narrow concentration range. Generally, 0.5 to 2.0 ng of DNA is optimal when using STR kits. A higher amount of DNA makes the interpretation of results harder and more time consuming, while a lower amount can result in the loss of alleles, which makes it harder to compare profiles with each other [7].
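Once the concentration of an extract is known, the volume to add to the PCR follows from a simple division. The sketch below assumes that the optimal range of 0.5 to 2.0 is expressed in nanograms, and it uses a target of 1.0 ng and a maximum template volume of 10 µL. These last two values, like the function names, are assumptions for illustration and not settings of any particular STR kit.

```python
OPTIMAL_MIN_NG = 0.5  # lower bound of the optimal input range (assumed to be ng)
OPTIMAL_MAX_NG = 2.0  # upper bound of the optimal input range (assumed to be ng)


def template_volume_ul(concentration_ng_per_ul: float,
                       target_ng: float = 1.0,
                       max_volume_ul: float = 10.0) -> float:
    """Volume of extract to add, capped at the largest volume the reaction allows."""
    if concentration_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return min(target_ng / concentration_ng_per_ul, max_volume_ul)


def classify_input(amount_ng: float) -> str:
    """Compare the amount of DNA actually added with the optimal range."""
    if amount_ng < OPTIMAL_MIN_NG:
        return "too low"    # risk of allele loss
    if amount_ng > OPTIMAL_MAX_NG:
        return "too high"   # results become harder to interpret
    return "optimal"


# A dilute extract of 0.02 ng/uL cannot reach the target within 10 uL.
volume = template_volume_ul(0.02)
status = classify_input(0.02 * volume)
```

In this example the extract is so dilute that even the maximum volume delivers only 0.2 ng, so the sample is flagged as too low and may need concentrating or a more sensitive method.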
Real-time quantitative PCR (qPCR)
As indicated above, a quantification test is performed to determine the amount of DNA that can be amplified. A test that indicates the quality and quantity of extracted DNA or PCR products helps to determine which steps need to be taken in the analysis of the DNA. The most common approaches to qPCR are the fluorogenic 5′ nuclease assay, better known as TaqMan, and the use of an intercalating dye, such as SYBR Green, that is specific for double-stranded DNA molecules. TaqMan monitors changes in fluorescence, while SYBR Green detects the formation of PCR product [7].
TaqMan
TaqMan probes carry two fluorescent dyes that emit at different wavelengths. The probe hybridizes specifically to the DNA target region between the two PCR primers. The probe has a slightly higher annealing temperature than the primers, so that it is already hybridized when extension of the primers begins. A reporter (R) dye is attached at the 5′ end of the probe, and a quencher (Q) dye is synthesized at the 3′ end (see figure 7). As long as both dyes are attached to the intact probe, the reporter does not fluoresce, because its signal is suppressed by energy transfer between the two dyes. When polymerization starts during the PCR run, any TaqMan probe attached to the target sequence is displaced, because Taq polymerase has 5′ exonuclease activity and chews away the attached probe. Once the reporter dye is released from the probe, it begins to fluoresce. The more fluorescence is measured, the more target sequence complementary to the probe is present [7].
Real-time PCR analysis
The PCR process is divided into three phases: the exponential phase, the linear phase and the plateau region. These phases become visible in a plot of fluorescence versus PCR cycle number (see figure 8). During exponential amplification, new PCR product is made with a high degree of precision. At 100% efficiency, the amount of PCR product doubles with every cycle. A plot of cycle number versus DNA concentration on a log scale therefore shows a linear relationship during the exponential phase (see figure 8). The next phase is the linear phase, in which the concentration of the components falls, amplification efficiency drops and the product increases arithmetically. The final phase is the plateau region, where the production of PCR product slowly stops because multiple components needed for amplification have been used up. The emitted signal then levels off. The exponential phase is the optimal place for measuring, because this is where the relationship between the amount of PCR product and the amount of input DNA is most likely to be consistent. Real-time PCR instruments use a cycle threshold (CT) in their calculations. This value is the number of cycles at which the fluorescence passes a threshold set by the instrument software. The fewer cycles needed to reach this threshold, the more input DNA molecules were present in the PCR reaction. This is why a plot of log DNA concentration versus CT shows a linear relationship with a negative slope. A rise in fluorescence can be traced back to the initial amount of DNA template by comparing it with samples of known concentration. In figure 8, five samples (a, b, c, d, e) were used to develop a standard curve.
If the standards show good consistency and precision, the concentration of an unknown sample can be calculated from the standard curve [7].
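The calculation from a standard curve can be sketched as a straight-line fit of CT against the logarithm of the concentration. The five standards below are invented values that lie exactly on a line with slope -3.32. They stand in for samples a to e and are not the data of figure 8. The function names are chosen for illustration.

```python
import math


def fit_standard_curve(concentrations, ct_values):
    """Least-squares line CT = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_concentration(ct, slope, intercept):
    """Read the concentration of an unknown sample off the standard curve."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 corresponds to perfect doubling."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical standards (ng/uL) and their measured CT values.
standards = [10.0, 1.0, 0.1, 0.01, 0.001]
cts = [21.68, 25.0, 28.32, 31.64, 34.96]

slope, intercept = fit_standard_curve(standards, cts)
unknown = estimate_concentration(26.66, slope, intercept)
efficiency = amplification_efficiency(slope)
```

With these made-up standards the fitted slope is about -3.32, which corresponds to an efficiency close to 100%, and an unknown sample with a CT of 26.66 comes out at roughly 0.32 ng/µL.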
Short Tandem Repeat (STR) analysis
STR analysis is the first method of analysis that is used [1]. 99.9% of human DNA is the same between individuals. Short tandem repeats are found in the remaining 0.1%, the region that exhibits large variation between individuals [7]. STRs are short DNA sequences that are repeated consecutively. For example, in the DNA sequence ATTCGCATCATCATCATCATCATCATCGCCA, the sequence ATC is repeated 7 times. Short tandem repeats are present at the same positions on the chromosomes of the human genome, but the number of repeats varies per individual. STR analysis is performed with the help of PCR. Labeled primers bind to the DNA at specific STR loci, and these loci are amplified. Because of the label attached to the primer, the amplified products can be detected at the end of the PCR reaction. By analyzing multiple STRs at once, a DNA profile can be made that is unique enough to identify an individual [7].
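Counting the repeats in a sequence like the one in the example can be done with a regular expression. In the sketch below the example sequence is rebuilt from its parts (the flanking bases and seven ATC units) so that the repeat count is explicit. A second allele with nine repeats is invented to show how two alleles at the same locus differ. In practice, STR alleles are determined from the length of the amplified fragment and not by reading the sequence in this way.

```python
import re


def longest_tandem_repeat(sequence: str, unit: str) -> int:
    """Number of consecutive copies of `unit` in the longest uninterrupted run."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [match.group(0) for match in pattern.finditer(sequence)]
    if not runs:
        return 0
    return max(len(run) for run in runs) // len(unit)


# The example from the text: flanking bases around seven ATC units.
allele_1 = "ATTCGC" + "ATC" * 7 + "GCCA"
# Hypothetical second allele at the same locus with nine repeats.
allele_2 = "ATTCGC" + "ATC" * 9 + "GCCA"

genotype = (longest_tandem_repeat(allele_1, "ATC"),
            longest_tandem_repeat(allele_2, "ATC"))
```

The resulting pair of repeat numbers, here 7 and 9, is the genotype for this locus, and a DNA profile is the list of such pairs over all loci that were analyzed.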
Repositioning primers
Sometimes a DNA sample is too degraded to yield a DNA profile with the standard STR markers. To still obtain a profile from such a sample, the primers are repositioned. By placing the primers closer to the repeat region that needs to be amplified, the resulting amplified product becomes much smaller. As a result, more STRs can be typed in a degraded sample than with the traditional primers. This makes STR miniplexes invaluable for degraded samples [1].
Single Nucleotide Polymorphism (SNP) analysis
With SNPs as genetic markers, the PCR-amplified product can be made even smaller than is possible with an STR miniplex [1]. A SNP is a single-nucleotide change in the DNA sequence and is the most common type of polymorphism in the human genome. These changes can result from errors in DNA replication or from chemical damage. A polymorphism is a variation in the DNA sequence that occurs with a frequency of more than 1% in the population. SNPs occur about once every 1000 base pairs and tend to remain stable in the population [8]. For example, in the known DNA sequence CCTAA, the second C can be changed to a T (see figure 9). The PCR product amplified with SNP markers can be as small as 60-80 bp. This makes it possible to type severely degraded DNA samples that could not be typed using STRs. With a method that was validated during the WTC identification effort, even highly compromised samples could still be typed. The method uses a fluorescent detection system: the possible alleles are labeled with fluorescent dyes, and the signals of the dyes are compared [1].
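The CCTAA example can be made concrete with a few lines of Python that compare a sample sequence with a reference and report every single-base difference. This is a simplified sketch for illustration only. Actual SNP typing relies on the fluorescent detection described above, and the function name is hypothetical.

```python
def find_snps(reference, sample):
    """Return (1-based position, reference base, sample base) per mismatch."""
    if len(reference) != len(sample):
        # A SNP substitutes one base for another, so the length stays the same.
        raise ValueError("sequences must have equal length")
    return [(i + 1, r, s)
            for i, (r, s) in enumerate(zip(reference, sample))
            if r != s]

snps = find_snps("CCTAA", "CTTAA")
print(snps)  # the second base differs: C in the reference, T in the sample
```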
Mitochondrial DNA analysis
Mitochondria are organelles located outside the cell nucleus. They contain their own DNA (mtDNA). mtDNA is small, circular in shape and inherited maternally. It consists of approximately 16,569 base pairs and is present in a high copy number in each cell. Because mtDNA is passed down maternally, it is a good method to use when reference samples are taken from maternal relatives. Using mtDNA can be an advantage when only very small amounts of DNA or degraded DNA can be extracted. Because of the high copy number of mtDNA, an identification can still be possible using mtDNA markers instead of the STR markers found in nuclear DNA [1].
There are several differences between mtDNA and nuclear DNA. The first is the size of the genome. Nuclear DNA has about 3.2 billion bp and makes up 99.75% of the total DNA, while mtDNA has 16,569 bp and makes up 0.25% of the DNA content per cell. The mutation rate of nuclear DNA is low, whereas that of mtDNA is 5-10 times higher.
Most of the mitochondrial DNA sequence is functional and highly conserved, which is why it shows little variation between individuals. However, there is a non-coding D-loop (about 1000 bp long), also known as the control region, that contains two hypervariable regions called HV1 and HV2 (see figure 8). The variations observed in this part of the DNA tend to be single nucleotide polymorphisms (SNPs). SNPs do not change the length of the mtDNA. The hypervariable regions are the ones used for forensic analysis. Because mtDNA lacks DNA repair, it has a high mutation rate, which causes the variation between individuals. The variation is limited, with HV1 and HV2 differing by only 1-3% between unrelated individuals [10].
During analysis, mitochondrial DNA is extracted and the HV1 and HV2 regions are amplified by polymerase chain reaction (PCR). The next step is to determine the sequence of the amplified product by DNA sequencing. The sequences of the samples can then be compared with each other to arrive at a possible identification [10].
Y-chromosome analysis
The Y-chromosome is small and is found only in male individuals. It changes only through infrequent mutations. In theory, a father and son share the same combination of alleles if no mutation occurs [9]. One advantage of Y-chromosome analysis is that in a mixed profile containing female DNA, for example in a sexual assault case, male-specific amplification can be used to separate the male component. Mixtures in fingernail scrapings and saliva on skin can also be analyzed. Another advantage is that it makes it possible to trace family lineages along the paternal line. A limitation is that Y-STR typing makes it hard to distinguish between male members of the same family. Duplications and deletions in the DNA sequence can also complicate the analysis. Y-chromosome analysis is performed using STRs or SNPs. Y-STRs mutate more rapidly and therefore show more variation, which makes them useful for forensic applications [10].
Capillary electrophoresis
The last step in DNA analysis is the separation and detection of the amplified DNA, for which capillary electrophoresis is used. A basic capillary electrophoresis apparatus consists of a narrow glass capillary, two buffer vials, two electrodes connected to a high-voltage power supply, a laser excitation source, a fluorescence detector, an autosampler for the samples and a computer that controls sample injection and detection. The capillaries are made of glass, have an internal diameter of 50 µm and are 50 cm long. The capillary is filled with a polymer that serves as a gel through which the molecules can migrate. Samples are injected into the capillary by applying a high voltage, which also drives the separation of the DNA fragments. The fluorescent dye labels on the DNA fragments are detected when they pass a detector that uses a laser beam. The results are then analyzed and stored on a computer [10]. A schematic of the capillary electrophoresis setup is shown in figure 9.
Polymer solution
DNA separation occurs when the DNA fragments migrate through the capillary under the applied voltage. Smaller fragments move through the capillary faster than larger fragments. The migration time is converted to a size in base pairs with the help of internal size standards. Several components can affect the DNA separation: the polymer, the capillary, the buffer and the voltage used [10].
The two primary polymers are POP-4 and POP-6. POP stands for Performance Optimized Polymer, which is made of linear, uncross-linked dimethyl polyacrylamide. The 4 and 6 stand for the 4% and 6% concentration of this component. POP-4 is used for STR typing and POP-6 for DNA sequencing. More recently, POP-7 has been introduced, which can be used for both STR typing and DNA sequencing [10].
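How a migration time is converted to a fragment size can be sketched as follows. This is a deliberately simplified illustration that interpolates linearly between the two size-standard peaks that bracket the unknown peak. Commercial sizing software uses more elaborate fitting methods, and the migration times and ladder sizes below are hypothetical values.

```python
from bisect import bisect_left

def size_fragment(migration_time, ladder):
    """Estimate fragment size in bp by linear interpolation between the two
    internal-standard peaks that bracket the observed migration time."""
    ladder = sorted(ladder)
    times = [t for t, _ in ladder]
    if not times[0] <= migration_time <= times[-1]:
        raise ValueError("peak lies outside the range of the size standard")
    i = bisect_left(times, migration_time)
    if times[i] == migration_time:
        return float(ladder[i][1])
    (t0, s0), (t1, s1) = ladder[i - 1], ladder[i]
    return s0 + (s1 - s0) * (migration_time - t0) / (t1 - t0)

# Hypothetical internal size standard: (migration time in seconds, size in bp).
ladder = [(1500.0, 100), (1750.0, 150), (2000.0, 200),
          (2250.0, 250), (2500.0, 300)]

size = size_fragment(1900.0, ladder)
print(f"estimated size: {size:.1f} bp")
```

Because smaller fragments arrive at the detector first, a shorter migration time maps to a smaller size.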
Buffer
The buffer is used to stabilize and solubilize the DNA. It also provides the charge carriers for the electrophoresis current. Buffer concentration is important, because a concentration that is too high can lead to overheating of the capillary and a loss of resolution. To avoid problems during the electrophoresis run, the buffer should be replaced regularly with fresh buffer [10].
Capillary
The capillary is central to the separation during electrophoresis. It is made of glass or fused silica. The inner wall carries hydroxyl groups that are negatively charged. Positive ions from the buffer are attracted to the wall and form a double layer. The movement of these ions creates an electroosmotic flow, which can be reduced or eliminated by coating the inside of the capillary (see figure 10). Uncoated walls cause problems for reproducible DNA separation. It is recommended to change the capillary after 100-150 runs or whenever a decline in resolution is noticed [10].
DNA sequencing
The Sanger method for DNA sequencing relies on the incorporation of nucleotides by a DNA polymerase. It uses dideoxyribonucleotide triphosphates (ddNTPs) as chain terminators, followed by a separation step capable of single-nucleotide resolution. Because a ddNTP has no hydroxyl group at its 3′ end, a chain that has incorporated a ddNTP cannot be extended any further. The reaction mix contains both dNTPs and ddNTPs, so some strands continue to grow while others are terminated. At the end of the reaction there is therefore a series of strands that each differ in length by one nucleotide. See figure 11 for a schematic overview of DNA sequencing. Each DNA strand is sequenced with either a forward or a reverse primer in a separate reaction. Fluorescent dyes are attached to the ddNTPs: ddTTP (thymine) carries a red dye, ddCTP (cytosine) a blue dye, ddATP (adenine) a green dye and ddGTP (guanine) a yellow dye. After the reaction, the sample is run through capillary electrophoresis, which separates the strands by length. The sequence of the DNA strand can then be read from the electrophoresis results [7].
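The principle of reading a sequence from chain-terminated fragments can be shown with a small Python simulation. It is an idealized sketch that assumes exactly one terminated fragment per position and perfect separation by length. The strand GATTCA is a hypothetical example and the function names are made up. The dye colors follow the assignment given above.

```python
import random

# Dye color attached to each terminating ddNTP, as described in the text.
DYE = {"T": "red", "C": "blue", "A": "green", "G": "yellow"}

def terminated_fragments(synthesized_strand):
    """One chain-terminated product per position, each ending in the
    dye-labeled ddNTP that stopped extension at that position."""
    return [synthesized_strand[:n]
            for n in range(1, len(synthesized_strand) + 1)]

def read_sequence(fragments):
    """Order fragments by length, as electrophoresis does, and read the dye
    color of the terminal base of each fragment in turn."""
    ordered = sorted(fragments, key=len)
    colors = [DYE[fragment[-1]] for fragment in ordered]
    color_to_base = {color: base for base, color in DYE.items()}
    return "".join(color_to_base[color] for color in colors), colors

strand = "GATTCA"  # hypothetical newly synthesized strand
fragments = terminated_fragments(strand)
random.shuffle(fragments)  # in the tube the fragments are all mixed together

sequence, colors = read_sequence(fragments)
print(colors)
print(sequence)
```

Sorting by length plays the role of the capillary here: the shortest fragment reaches the detector first, so the order of the colors gives the order of the bases.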
DNA Databases and CODIS
A DNA database stores DNA data and is used for analysis. There are different types of DNA databases. National DNA databases are maintained by governments to store DNA profiles of their populations. These DNA profiles are based on PCR and STR analysis. DNA databases are valuable tools in aiding investigations. CODIS (Combined DNA Index System) is a DNA database built on 13 core STR loci: CSF1PO, FGA, TH01, TPOX, VWA, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51 and D21S11. In addition, Amelogenin is used for sex determination [6] [10] [11].
STR Kits
To make the work easier and avoid confusion with buffers, it is better to use a complete STR kit. These kits come with their own buffer and control DNA. There are many suppliers of complete STR kits, and which kit is needed depends on the research. Multiplex STR kits allow the 13 CODIS loci to be amplified together [6].
Among the most widely known and used kits are those from Promega. The kit components normally consist of a buffer, a primer pair mix (which primers are used depends on the research) and control DNA. These are used for the PCR. For the analysis after the PCR, the kits include an allelic ladder mix and an internal size standard for the electrophoresis run. Chemicals that are sometimes not included and have to be supplied separately are nuclease-free water and Taq DNA polymerase. The kits come with their own protocol, whose steps should be followed in the specified order [6].
An example is the PowerPlex 16 HS System, which allows co-amplification of the 13 CODIS loci and Amelogenin for sex determination. All loci are amplified in a single tube and analyzed in a single injection, and the system is compatible with ABI instruments. The pre-amplification components box contains PowerPlex HS 5X Master Mix, PowerPlex 16 HS 10X Primer Pair Mix, 2800M Control DNA and water (amplification grade). The post-amplification components box contains the PowerPlex 16 HS Allelic Ladder Mix and the Internal Lane Standard. How the mixes for the PCR amplification should be prepared is shown below in figure 10 [6] [12].
Statistical interpretation of DNA profile
When a DNA profile matches, it is important to determine whether the profiles come from the same individual or whether another individual could have the same profile. It is practically impossible to obtain DNA from every individual. To still be able to determine a match probability, allele frequencies for different ethnic groups are taken from validated databases [13].
More than one data point needs to be collected to get a reliable estimate of an allele frequency. There has to be a minimum that ensures an allele has been sampled sufficiently for use in statistical tests. An allele has to be observed at least five times to qualify for use in statistical calculations. The minimum allele frequency is determined with the formula 5/(2N), where N is the number of individuals in the database and 2N is the number of chromosomes counted, since each individual carries two copies [13].
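The minimum allele frequency rule can be written out as a short Python sketch. The database size of 302 individuals (604 chromosomes) is an assumption chosen for illustration, and the function names are hypothetical.

```python
def minimum_allele_frequency(n_individuals, min_observations=5):
    """Minimum allele frequency 5/(2N) for a database of N individuals."""
    return min_observations / (2 * n_individuals)

def reportable_frequency(observed_count, n_individuals):
    """Use the observed frequency, unless the allele was seen fewer than
    five times, in which case the minimum allele frequency is used."""
    observed = observed_count / (2 * n_individuals)
    return max(observed, minimum_allele_frequency(n_individuals))

n = 302  # assumed number of individuals, i.e. 604 chromosomes
floor = minimum_allele_frequency(n)
rare = reportable_frequency(2, n)      # allele seen only twice
common = reportable_frequency(205, n)  # allele seen 205 times

print(f"minimum: {floor:.5f}, rare: {rare:.5f}, common: {common:.5f}")
```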
Frequency calculations
To calculate the frequency of a DNA profile, the genotype frequencies of all loci analyzed are multiplied. The frequency of an allele can differ depending on the population database. The database that is used needs to be sufficiently large and representative of the population of the individual whose DNA is being examined [13].
In figure 12, DNA profile frequencies are given for the U.S. Caucasian population. Take for example allele 11 of D13S317. In a sample of 604 alleles it was observed 205 times, which translates to an allele frequency of p = 0.33940. In other words, about 34% of the D13S317 alleles in this population are allele 11. The same goes for allele 14 of D13S317. This allele was observed 29 times among the 604 alleles, so its frequency is q = 0.04801. The frequency of the genotype of an individual with alleles 11 and 14 at D13S317 can be calculated using the formula 2pq (p being the frequency of the first allele and q that of the second). Entering the frequencies into the formula gives 2 x 0.33940 x 0.04801 = 0.0326. This means that approximately 3% of the U.S. Caucasian population is expected to have this genotype. The frequencies of the genotypes at all loci are then multiplied to get the frequency of the DNA profile. For the DNA profile in figure 12, this comes to a combined frequency of 1 in 8.37 x 10^14, or 1 in 837 trillion [13].
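The D13S317 calculation can be reproduced with a few lines of Python. The allele frequencies for alleles 11 and 14 come from the text. The genotype frequencies of the two extra loci in the last step are hypothetical placeholders that only show how the multiplication works. The p x p formula for a homozygous genotype is the standard Hardy-Weinberg counterpart of 2pq and is added here as an assumption, since the text only covers the heterozygous case.

```python
from math import prod

def genotype_frequency(p, q=None):
    """Expected genotype frequency: 2pq for two different alleles,
    p * p when both alleles are the same (assumption, see lead-in)."""
    return p * p if q is None else 2 * p * q

def profile_frequency(genotype_frequencies):
    """Multiply the genotype frequencies of all loci in the profile."""
    return prod(genotype_frequencies)

p = 205 / 604  # allele 11 of D13S317: observed 205 times in 604 alleles
q = 0.04801    # allele 14 of D13S317: frequency as given in the text

d13 = genotype_frequency(p, q)
print(f"D13S317 genotype 11,14: {d13:.4f}")

# Hypothetical genotype frequencies for two further loci.
profile = profile_frequency([d13, 0.05, 0.10])
print(f"three-locus profile: 1 in {1 / profile:,.0f}")
```

Each additional locus multiplies the profile frequency by a number well below 1, which is why a full multi-locus profile becomes so rare.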
Quality assurance
To ensure that DNA testing is performed correctly, laboratories have to adhere to standards. These standards are meant to assure the quality of the work. Every laboratory should have a quality assurance program in place.
Organization and management
First, the laboratory needs a management staff with the authority and means to carry out the duties assigned to them and to maintain quality assurance in the laboratory. There should also be a technical leader who is in charge of the technical operations. If CODIS is in use at the laboratory, there should be a casework CODIS administrator who is accountable for CODIS on site. The laboratory should also have at least two full-time employees who are qualified DNA analysts. The responsibility, authority and interrelation of all personnel working at the laboratory have to be specified and documented. Lastly, an approved contingency plan has to be in place in case the technical leader position is vacated [14].
Personnel
The employees of the laboratory need to have the proper education and experience. Each member of the personnel needs to have a written job description, which can be expanded with documentation of responsibilities, duties and skills. The laboratory needs to have a training program for qualifying all analysts. It includes all the DNA analytical procedures that the personnel will perform. The practical exercises need to include the examination of the range of samples that are routinely analyzed at the laboratory. The training program needs to teach and assess the skills and abilities needed to perform a successful DNA analysis. The trainee needs to give an individual demonstration of capability, which the laboratory documents. When a new analyst is hired, the technical leader is tasked with assessing the new hire's previous training and is responsible for the proper documentation. If the training program is modified, this also has to be documented. All analysts have to successfully complete a competency test covering the DNA techniques before working on actual casework. A documented program needs to be in place to make sure that technical qualifications are maintained. This can be done by attending seminars, courses, professional meetings or other documented classes that are relevant. Together these need to add up to a minimum of eight hours annually. If training is given by the laboratory itself, all relevant information about the training and the presenter needs to be documented. When training is done externally, certificates, a program agenda or travel documentation can serve as documentation of attendance. Multimedia and internet-based programs need to be approved by the technical leader. Attendance needs to be formally documented, and when the program is finished the technical leader reviews and approves it.
The laboratory also needs to have an approved program for the annual review of scientific literature, so that the analysts' ongoing reading of scientific literature is documented. The laboratory needs to have access to a collection of literature on DNA analysis [14].
The technical leader also has to meet certain qualifications. The minimum education requirement is a master's degree in biology, chemistry or forensic science. In addition, the technical leader needs to have successfully completed 12 semester or equivalent credit hours in the subjects biochemistry, genetics, molecular biology and statistics or population genetics. These 12 semester hours need to include at least one graduate-level course of 3 semester hours. The technical leader also needs to have a minimum of three years of experience in a forensic DNA laboratory that participated in the identification and evaluation of biological evidence in criminal matters. The duties of the technical leader include overseeing the technical operations of the laboratory and having the authority to initiate, suspend or resume DNA analytical operations. Other responsibilities include evaluating and documenting the validation of the methods used and proposing new or modified analytical procedures. The technical leader also reviews academic documents and training records and approves the qualifications of new analysts before they are allowed to perform casework analysis independently. The technical leader has to approve the specifications for outsourcing agreements and review internal and external DNA audits, with any corrective actions being carried out and documented. Lastly, the technical leader is responsible for the annual review and documentation of laboratory procedures and for reviewing and approving training, quality assurance and proficiency testing. The technical leader needs to be a full-time employee and needs to be accessible for consultation on site, by telephone or electronically [14].
As an employee of the laboratory, the analyst needs to have at minimum a bachelor's degree or an advanced degree in biology, chemistry or a forensic science related area. The analyst is expected to have successfully completed course work in the subject areas of biochemistry, genetics, molecular biology, and statistics and population genetics. The analyst needs to have at least six months of experience in a forensic human DNA laboratory. This experience needs to be documented and, if needed, augmented by additional training in the DNA identification procedures of the laboratory. Before participating in casework, the analyst needs to complete the analysis of a range of samples that the laboratory regularly encounters and pass a competency test [14].
Facilities
Laboratories need to be designed in a way that ensures the integrity of the analyses and the evidence. Entry to the laboratory needs to be controlled in order to keep out unauthorized personnel. The distribution of keys or combinations needs to be documented and limited to personnel designated by laboratory management [14].
Before PCR is performed, DNA extraction and PCR setup need to be carried out either at separate times or in separate spaces. Amplified DNA needs to be generated, processed and maintained in one or more rooms that are separate from the areas used for evidence examination, DNA extraction and PCR setup. A robotic workstation may be used to carry out DNA extraction, PCR setup and amplification in a single room, but only if the analytical process has been validated. Lastly, procedures for cleaning and decontaminating the facilities and equipment need to be in place and followed [14].
Evidence control
To ensure the integrity of the physical evidence, the laboratory needs to have a documented evidence control system. Evidence needs to be marked with a unique tag on the evidence package. What counts as evidence and what counts as work product needs to be clearly defined. When a tag is not used, a system for distinguishing samples while they are being processed needs to be devised and put in place. The chain of custody needs to be documented and maintained, either in hard copy or in electronic format. A signature, or an equivalent in the case of an electronic format, needs to be recorded with the corresponding date for everyone who received or transferred the evidence. Documented procedures need to be in place to minimize loss, contamination and deleterious change of evidence. A controlled and secure area needs to be available for storing evidence and work product that is still in progress [14].
Analytical procedures
Analytical procedures need to be in place, followed and approved by the technical leader. The procedures need to be reviewed annually. Every analytical method needs to have its own standard operating procedure that specifies items such as the reagents, sample preparation, extraction methods, equipment and controls. A written procedure for documenting commercial reagents and reagents formulated in-house needs to be in place. Commercial reagents are to be labeled with the name of the reagent and the expiration date. Reagents formulated in-house are to be labeled with the name of the reagent, either the date of preparation or the expiration date, and the analyst who prepared the reagent. Critical reagents in test kits or systems for quantitative PCR and genetic typing need to be identified and evaluated before they are used on evidence samples. Thermostable DNA polymerase, primer sets and allelic ladders also need to be identified and evaluated [14].
Before DNA amplification, evidence samples need to be quantified. Quantification can only be omitted when a validated system is used that has been shown to yield successful DNA amplification reliably and reproducibly. Quantification standards are to be used when quantifying. The positive and negative controls need to be amplified at the same time as the case samples, at the same loci and with the same set of primers. Reagent blank controls used for extraction are to be extracted simultaneously with the case samples, amplified using the same primers, instrument and conditions, and typed using the same instrument and injection conditions. The DNA procedures used by the laboratory are to be checked annually or whenever a major change is made to a method [14].
Reports
A procedure for taking and maintaining casework notes needs to be in place. These notes need to support the conclusions drawn in the reports. The laboratory needs to maintain and retain, in hard copy or electronic form, all notes and documentation produced by the analysts in the course of their analyses. Casework reports need to contain the case identifier, a description of the evidence examined, a description of the method used, the amplification system, the results and/or conclusions, an interpretive statement, the date the report was issued, the disposition of the evidence and, lastly, a signature and title [14].
Proficiency testing
Twice a year, once in the first six months and once in the last six months of the year, analysts, technical reviewers, technicians and other personnel designated by the technical leader need to undergo proficiency testing. Workers who regularly use both manual and automated methods need to be proficiency tested in each at least once a year. New workers need to go through proficiency testing within their first six months. The proficiency testing needs to be defined and documented, with the correct dates recorded. The items from the proficiency test that need to be recorded are the test set identifier, the analyst or participant, the dates of analysis and completion, copies of the data and notes that support the conclusions, the proficiency test results, any discrepancies and, if needed, the corrective actions that were taken [14].
Sources
1. Lessons Learned From 9/11: DNA Identification in Mass Fatality Incidents https://www.ncjrs.gov/pdffiles1/nij/214781.pdf
2. M. Prinz, A. Carracedo; DNA Commission of the International Society for Forensic Genetics (ISFG): Recommendations regarding the role of forensic genetics for disaster victim identification (DVI); Forensic Science International: Genetics; pages 3-12; 2007
3. B. Budowle, F.R. Bieber, A.J. Eisenberg; Forensic aspects of mass disasters: Strategic considerations for DNA-based human identification; Legal Medicine; pages 230-243; 2005
4. Mo Bi Tec molecular biotechnology
http://www.mobitec.com/cms/products/bio/06_dna_prot_tools/mbeads4.html
5. NFSTC Science serving justice
http://www.nfstc.org/pdi/Subject03/pdi_s03.htm
6. Forensic DNA typing STR: Page 62-64, 84-86,92,93
7. Butler John M.; Advanced topics in forensic DNA typing: methodology; 2012; Elsevier; Maryland USA; pages 141-150, 371, 372, 417, 418
8. Sequencing, forensic analysis and genetic analysis http://www.atdbio.com/content/20/Sequencing-forensic-analysis-and-genetic-analysis#Short-tandem-repeats
9. Sirius genomics
10. Forensic science central; DNA
http://forensicsciencecentral.co.uk/dna.shtml
11. FBI biometric analysis
http://www.fbi.gov/about-us/lab/biometric-analysis/codis
12. Promega technical manual for the PowerPlex 16 HS System
https://nld.promega.com/resources/protocols/technical-manuals/101/powerplex-16-hs-system-protocol/
13. Butler John M.; Fundamentals of forensic DNA typing; 2010; Elsevier; Maryland USA; page 111-121, 229, 238-241, 251, 252
14. Quality Assurance standards for forensic DNA testing laboratories
http://www.cstl.nist.gov/strbase/QAS/Final-FBI-Director-Forensic-Standards.pdf
Appendix I
DNA analysis workload estimation worksheet