Insilicase

deus ex computa


dbQC: ABI sequence quality tracking system.

Overview

With the advent of high-throughput Sanger sequencing technologies, the elucidation of a DNA sequence has become a trivial exercise, to the point where in many institutes sequencing has become a core service. However, this separation of labour has made it harder to track the cause of poor sequence data, since the bench researchers will know nothing of the events within the sequencing service, while the service providers will know little about the quality of the templates they are working with. To overcome this problem we developed dbQC to track changes in sequence quality (Phred scores) over time with respect to the DNA source, machine operator, individual machine or specific capillary. Phred scores are created by the ABI base calling software, which needs to be configured to save this data in the form of *.phd.1 files. By following the trends in sequence quality it is possible to identify:

The point at which the capillaries need to be changed.
‘Operators’ that need further training.
‘Users’ who consistently send samples that generate low quality sequences.
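
As an illustration of the input data, the sketch below reads the per-base quality scores from the text of a *.phd.1 file. It is not taken from dbQC itself. It assumes the usual Phred file layout, in which each line between BEGIN_DNA and END_DNA holds a base, its quality score and its peak position; the function name read_phd_qualities is hypothetical.

```python
from typing import List


def read_phd_qualities(text: str) -> List[int]:
    """Return the per-base quality scores found in the text of a .phd.1 file.

    Assumes each line between BEGIN_DNA and END_DNA looks like
    "<base> <quality> <peak position>"; other sections are ignored.
    """
    qualities: List[int] = []
    in_dna = False
    for line in text.splitlines():
        token = line.strip()
        if token == "BEGIN_DNA":
            in_dna = True
        elif token == "END_DNA":
            break
        elif in_dna and token:
            fields = token.split()
            if len(fields) >= 2:
                # The second column is the Phred quality score.
                qualities.append(int(fields[1]))
    return qualities
```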

Program requirements

The program has no minimum requirements other than the presence of the .NET framework version 2.0, which can be obtained free of charge from Microsoft.com (http://www.microsoft.com/downloads/details.aspx?FamilyID=0856EACB-4362-4B0D-8EDD-AAB15C5E04F5&displaylang=en). The sequence quality data is stored in a password-protected Microsoft Access data file. However, MS Access is NOT required by this program.

Terms used in this walkthrough

When entering quality scores into the database it is necessary to link the scores to certain parameters used in their production. These have been termed ‘User’, ‘Operator’, ‘Machine’ and ‘Well’. While the latter two terms are strongly linked to a physical object, the first two may be used to describe a number of different entities, depending on the environment in which the program is used. ‘User’ can be used to describe the origin of the DNA template, which could be an individual, a research group, a project or a specific method of DNA extraction or purification. Likewise, the term ‘Operator’ can refer to a person who operated the sequencing machine or performed the sequencing reactions, or to different service providers. Clearly the use of these terms is variable, but for the purpose of this document a ‘User’ is an individual who sends a DNA template to the ‘Operator’, and the ‘Operator’ is the individual who performs the sequencing reactions and then loads them on to the sequencing machine.

Analysis settings

For each trace file, the program calculates the highest average Phred score over a long and a short length of DNA. This score is not tied to a specific position in the sequence, but is the highest average Phred score for a stretch of that length anywhere in the trace file. To pass, a sequence must have an average score of more than 40 (the Average Phred cut off value), e.g. if the highest scores are 41 over 150bp and 39 over 300bp, the sequence is said to have passed at 150bp and failed at 300bp. Since only the sequence quality scores are kept in the database, these values cannot be changed once the database contains data; it is therefore only possible to adjust them when working with an empty data file (see ‘Adjusting the analysis settings’).
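
The calculation described above can be sketched as a sliding window over the per-base Phred scores. This is a minimal illustration and not dbQC's own code: the function names are hypothetical, and treating a read shorter than the window as a failure is an assumption.

```python
from typing import Optional, Sequence


def best_window_mean(scores: Sequence[int], window: int) -> Optional[float]:
    """Highest mean of any `window` consecutive scores, or None if the
    read is shorter than the window (assumed behaviour)."""
    if window <= 0 or len(scores) < window:
        return None
    running = sum(scores[:window])
    best = running
    for i in range(window, len(scores)):
        # Slide the window one base to the right.
        running += scores[i] - scores[i - window]
        best = max(best, running)
    return best / window


def passes(scores: Sequence[int], window: int, cutoff: float = 40) -> bool:
    """A read passes when its best window mean is more than the cut off."""
    best = best_window_mean(scores, window)
    return best is not None and best > cutoff


# 150 bases scoring 41 followed by 150 bases scoring 37: the best average
# is 41 over 150bp and 39 over 300bp, as in the example in the text.
read = [41] * 150 + [37] * 150
print(passes(read, 150), passes(read, 300))
```

With the example read, the sequence passes at 150bp and fails at 300bp.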

Using the program

Starting the program

When the program runs for the first time, it looks for the dbQC.mdb database file in the same folder as the program. If it does not find the file, it will prompt you to select the database file. The database file can be located on the current machine or on a network share, in which case multiple people can view the data. Once the program has been linked to a file, it will look for this file each time it is run. If you have more than one database file, it is possible to change between them using the ‘Admin’ > ‘Change database’ menu option (Figure 1).

Figure 1

The program’s interface contains five ‘tabs’ which expose different types of task. The most common tasks are shown first. However, the database must be set up correctly before the program can be used, so this walkthrough describes the tabs in reverse order. To edit the database, you must log on as an Administrator via the ‘Admin’ > ‘Login’ menu.

Adjusting the analysis settings

Since only the calculated quality score data is stored in the database, changes to these settings cannot be applied retrospectively. Therefore, to prevent the database from holding data calculated with different analysis criteria, these settings can only be changed in an empty database. To change the settings, select the ‘Settings’ tab (Figure 2).

Figure 2

The lengths of the short and long sequences over which the average Phred score is calculated are set using the upper two input boxes, while the Average Phred cut off value is set via the lower text box. The permitted values are shown in Table 1. If you enter an invalid value, the ‘Update’ button will be disabled and the label next to the erroneous value will be written in red.

Setting	Minimum value	Maximum value
Long sequence	100bp	1000bp
Short sequence	50bp	800bp
Average Phred cut off value	20	100

Table 1
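
The limits in Table 1 amount to three range checks. The sketch below is only an illustration of those checks: the names are hypothetical, and it assumes that the minimum and maximum values are themselves permitted.

```python
from typing import Dict, List, Tuple

# Permitted (minimum, maximum) values from Table 1.
LIMITS: Dict[str, Tuple[int, int]] = {
    "long": (100, 1000),   # long sequence length in bp
    "short": (50, 800),    # short sequence length in bp
    "cutoff": (20, 100),   # Average Phred cut off value
}


def invalid_settings(long: int, short: int, cutoff: int) -> List[str]:
    """Return the names of the settings that fall outside Table 1.

    An empty list means every value is acceptable (assuming the
    limits are inclusive).
    """
    values = {"long": long, "short": short, "cutoff": cutoff}
    return [
        name
        for name, value in values.items()
        if not LIMITS[name][0] <= value <= LIMITS[name][1]
    ]
```

For example, invalid_settings(1200, 150, 10) reports the long sequence length and the cut off value as out of range, which corresponds to the labels the form would show in red.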

Adding and editing ‘Users’, ‘Operators’ and ‘Machines’

To add a ‘User’, ‘Operator’ or sequencing ‘Machine’ to the database, select the ‘Edit’ tab (figure 3). Press the ‘User’ button to add either ‘Users’ or ‘Operators’; similarly, press the ‘Machine’ button to add an ABI sequencing ‘Machine’. Since the program is designed to show trends in data, it is not possible to delete a person or machine from the database. Instead, ‘Users’ should be set as ‘Not current’ and ‘Machines’ as ‘Not in use’.

Figure 3

Adding and updating ‘Users’ or ‘Operators’

As stated earlier the exact meaning of ‘User’ and ‘Operator’ may vary depending on the environment in which you work. However, the program assumes both to be people and so ‘Users’ and ‘Operators’ are added via the same form (Figure 4).

The database comes with two predefined ‘Users’ called ‘Admin’ and ‘Not set’. The ‘Admin’ ‘User’ (password admin) is used to log on to the system when setting up the database and should be renamed (and given a new password) if the database is to be shared among multiple users. The ‘Not set’ ‘User’ cannot be changed and is used by the program as the default ‘User’ for wells that are not linked to a specific ‘User’ (see ‘Adding Phred score data’ below).

The form is composed of two panels; the upper ‘User’ panel is used to select the person, while the lower ‘Editing User’ panel is used to edit the person’s settings. To edit an existing person select their name from the drop down list and change their details in the lower panel. Similarly to add a new person, press the ‘New’ button to the right of the drop down list and enter their details in the lower panel. An explanation of these settings is given in table 2. Finally to save the changes to the database press the ‘Submit’ button.

Figure 4

User name	This is the name used to identify the person while using the database. It must be between 4 and 50 characters (including spaces).
Password	This is the password used by Administrators to log in. It should be between 4 and 14 characters. (All ‘Users’ have passwords.)
Administrator	If checked, it allows the person to edit the database (i.e. add people, machines and plate data); otherwise the person can only see the trends.
Current user	It is only possible to link current ‘Users’ and ‘Operators’ to plate data.
Is the person an ‘Operator’	Only people marked as operators can be selected as running a machine/plate.

Table 2

Adding and updating ‘Machines’

The process of adding ABI sequencing machines is very similar to adding ‘Users’. To edit an existing machine, select its name from the list in the upper ‘Machine’ panel; to add a new ‘Machine’, press the ‘New’ button. Then edit the data in the lower ‘Editing machine’ panel. The machine name must be between 4 and 20 characters and is used to identify the machine when adding or viewing data. The number of capillaries is set by selecting a value from the list below the machine name text box. Finally, checking the ‘In use’ option allows the machine to be linked to new plate data.

Figure 5

Adding and deleting sequence quality score data

Since sequencing is performed on a per plate basis, the addition and deletion of quality scores is also done per plate. Adding and deleting data is done via the ‘Plate’ tab (figure 6).

Figure 6

Adding Phred score data

To add the Phred scores for a sequencing plate, press the ‘Add’ button on the ‘Plate’ tab to access the ‘Plate setup’ form (figure 7). This form is divided into three panels: the ‘Plate setup’ panel, the ‘Plate details’ panel and the ‘Phred files (*.phd.1)’ panel. The ‘Plate setup’ panel enables you to select the ‘Machine’ and ‘Operator’ that processed the sequence reactions and the day on which this was done. The ‘Plate details’ panel allows you to enter a name for the plate and link each sequence reaction on the plate to a specific ‘User’. Pressing the ‘Generic’ button links all the plate’s wells to the default ‘Not set’ user, while pressing the ‘User defined’ button allows you to link each well to a specific ‘User’ as described below. Finally, pressing the ‘Select’ button in the ‘Phred files (*.phd.1)’ panel prompts you to select the folder containing the Phred score (*.phd.1) files. The program automatically links these files to the correct well by searching for the well’s coordinates in the file name. This function requires the files to have the original names allocated to them by the analysis software, with the format ‘B01_******.phd.1’, ‘****_B01_***.phd.1’ or ‘********_B01.phd.1’.
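
The file naming rule above can be illustrated with a regular expression. This is a sketch of one way to do the matching, not the method used by dbQC: it assumes a 96 well plate (rows A to H, columns 01 to 12) and that the coordinates are separated from the rest of the name by underscores, as in the three formats listed. The function name is hypothetical.

```python
import re
from typing import Optional

# Well coordinates at the start of the name or after an underscore,
# followed by an underscore or by the .phd.1 extension.
WELL_RE = re.compile(r"(?:^|_)([A-H](?:0[1-9]|1[0-2]))(?:_|\.phd\.1$)")


def well_from_filename(name: str) -> Optional[str]:
    """Return the well coordinates (e.g. 'B01') found in a Phred file
    name, or None if the name does not fit the expected formats."""
    if not name.endswith(".phd.1"):
        return None
    match = WELL_RE.search(name)
    return match.group(1) if match else None
```

A file that has been renamed so that the coordinates are no longer delimited in this way (e.g. ‘sampleB01x.phd.1’) returns None under these assumptions and could not be placed on the plate.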

Linking a well to specific ‘User’

Pressing the ‘User defined’ button on the ‘Plate details’ panel displays the ‘Well set up’ form shown in figure 8. Each well in the plate is represented by a square labelled with the well’s coordinates. These squares are initially pink to signify that the well is linked to the default ‘User’ (‘Not set’). To link a well to a ‘User’, select their name in the list of ‘User’ names at the bottom left corner of the form and click the appropriate square with the mouse; the square should then become blue. To reset the well to the default ‘User’, select ‘Not set’ from the list and click the square again. It is possible to link multiple wells to the same ‘User’ by holding down the ‘r’ or ‘c’ key while clicking a square with the mouse. This function is shown in figure 9: if the cursor is placed over well A01 and the mouse button is pressed while the ‘r’ key is held down, the squares in the red box will be linked to the selected ‘User’, while holding the ‘c’ key down will link the wells in the blue box. Similarly, if this is repeated with the cursor over well D06, the wells in the yellow (‘r’ key down) and green (‘c’ key down) boxes will be linked to the same ‘User’.

Figure 7

Figure 8

When you have finished linking the wells to ‘Users’, press the ‘Save’ button to update the data. If you do not want to save the changes, press the ‘Back’ button to discard them. Finally, to calculate and store the quality data, press the ‘Analyse’ button on the ‘Plate setup’ form.

Figure 9

Deleting the quality scores for a plate

To delete a plate, press the ‘Delete’ button on the ‘Plate’ tab (figure 6). This will display the ‘Delete a plate’ form, which allows you to select a plate from all those held in the system (figure 10). Initially, all the plates that were run between 28 August 2008 and the present day are listed in the list box in the lower ‘Select plate’ panel. It is possible to limit the plates shown in this list by run date and/or plate name. To limit the plates by date, select the appropriate dates in the two calendar controls in the ‘Limit search by…’ panel and then press the ‘Limit’ button. To limit by plate name or partial name, enter the name in the text box below the calendar controls and then press the ‘Limit’ button. This will limit the names in the plate list to those that contain the text you have entered, irrespective of case. The list can be limited by both text and date simultaneously.
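
The effect of the ‘Limit’ button can be sketched as a filter on run date and a case-insensitive match on the plate name. The Plate record and the function below are hypothetical and only illustrate the behaviour described; treating the two dates as inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass
class Plate:
    name: str
    run_date: date


def limit_plates(
    plates: Iterable[Plate],
    start: Optional[date] = None,
    end: Optional[date] = None,
    text: str = "",
) -> List[Plate]:
    """Keep plates run between `start` and `end` (inclusive, assumed)
    whose name contains `text`, irrespective of case."""
    needle = text.casefold()
    return [
        plate
        for plate in plates
        if (start is None or plate.run_date >= start)
        and (end is None or plate.run_date <= end)
        and needle in plate.name.casefold()
    ]
```

Leaving the text empty limits by date only, and leaving both dates unset limits by name only, so the same function covers each combination described above.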

Figure 10

When a plate name is selected, a brief summary is displayed above the ‘Delete’ button. Finally, to delete the plate press the ‘Delete’ button. Once the plate has been removed, its data is permanently lost.

Viewing a plate’s quality scores

To view the quality scores for a single plate, select the ‘View plate’ tab and press the ‘View’ button (figure 11). This will display the ‘Select a plate’ form (figure 12), which is used in the same way as the ‘Delete a plate’ form described above. Once you have selected a plate, press the ‘View’ button. This displays the ‘Plate data’ form (figure 13), in which the data is arranged as a grid mirroring the wells in a microtitre plate. If the average quality score for a well is higher than the cut off, the square is blue, whereas squares for wells that failed are red (wells H09 and H12 in figure 13). If a well is not linked to a Phred file, the square is white. The list box at the bottom left of the form allows you to alternate between the quality scores for the long and short sequences.

Figure 11

Figure 12

Figure 13

Viewing quality score trends

The quality score data can be examined for long-term trends or for differences between ‘Users’, ‘Operators’, machines or individual capillaries. The analysis criteria are selected via the ‘Phred score trends’ form (figure 15), which is displayed by pressing the ‘Trends’ button on the ‘View trends’ tab (figure 14).

The ‘Phred score trends’ form is composed of two panels: the upper ‘Option’ panel and the lower ‘Limit results to a period of time’ panel. The lower panel allows you to limit the analysis period, with the time window set using the two calendar controls. The upper panel contains three tabs that allow you to sort the data by ‘User’, machine ‘Operator’ and machine (figures 15a, 15b and 15c respectively). Pressing the ‘Show’ button in the bottom right of the form extracts the data from the database and displays it as a graph (Figure 16). To save the underlying data, press the ‘Save’ button on the ‘Graph’ form.

Each tab contains a list of people or machines. Only the items ticked in this list are used in the trend analysis, unless no item is ticked, in which case all the items are used. For instance, in figure 15a the list contains the ‘User’ names ‘Admin’, ‘Guest’, ‘James’ and ‘Not set’. Since only ‘James’ is checked, only data linked to ‘James’ would be displayed (figure 16a); however, if none are selected, all the ‘Users’ will be included in the analysis (figure 16b). When data for more than one person is selected, each person’s data can be highlighted by selecting their name from the list at the bottom left corner of the form (figure 16c).

Figure 14

Figure 15

Figure 16

When viewing ‘Operator’ and ‘Machine’ data it is possible to subdivide the data using the check boxes to the left of the list of ‘Operators’ or ‘Machines’. When the ‘Group by machine’ option is checked on the ‘Operators’ tab, it is possible to view quality data for each ‘Operator’/‘Machine’ combination that exists in the database. Similarly, on the ‘Machines’ tab it is possible to subdivide the data into combinations of ‘Machine’, ‘Machine’/‘Operator’, ‘Machine’/‘Capillary’ and ‘Machine’/‘Operator’/‘Capillary’.
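
The subdivisions described above are equivalent to grouping the stored scores by one or more fields before averaging them. The sketch below illustrates the idea with hypothetical record fields (‘machine’, ‘operator’, ‘capillary’, ‘score’); it does not reflect the structure of the dbQC database, and averaging is used only as an example summary.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Sequence, Tuple


def group_means(
    records: Iterable[Mapping[str, object]],
    keys: Sequence[str],
) -> Dict[Tuple[object, ...], float]:
    """Average the 'score' field for every combination of `keys`
    that actually occurs in the records."""
    groups: Dict[Tuple[object, ...], List[float]] = defaultdict(list)
    for record in records:
        group = tuple(record[key] for key in keys)
        groups[group].append(float(record["score"]))
    return {group: sum(vals) / len(vals) for group, vals in groups.items()}


records = [
    {"machine": "ABI-1", "operator": "James", "capillary": 1, "score": 40},
    {"machine": "ABI-1", "operator": "James", "capillary": 2, "score": 50},
    {"machine": "ABI-1", "operator": "Guest", "capillary": 1, "score": 30},
]
by_machine = group_means(records, ["machine"])
by_machine_operator = group_means(records, ["machine", "operator"])
```

Passing a longer list of keys, such as machine, operator and capillary, gives the finer subdivisions; only combinations present in the data produce a group.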