Entrance I: group analysis (consistent with the original version)

Quick introduction to analysis procedure

1. Choose the species group that contains your subject species
The following table summarizes all of the species groups gathered by agriGO v2.0 at present:

Category    Classification     Counts
Plant       Brassicaceae       12
            Poaceae            29
            Malvaceae          6
            Fabaceae           16
            Solanaceae         12
            Rosaceae           5
            Medicinal plant    12
            Tree               29
            Algae              18
Animal      Fish               20
            Aves               11
            Amphibia           3
            Insecta            56
            Mammalia           58
Fungi       Sordariomycetes    5
2. Choose a tool and set parameters
Choose one tool to proceed. The frames on the right side of the page contain interactive annotation text, and their content changes depending on the parameters you choose. You can show or hide the help frames with the HELP buttons at the top right of the page.

3. Submit your job and perform analysis
After you submit your job, agriGO pre-checks the validity of your uploaded data. If the job is submitted successfully, a job ID is given. Since the analysis can take a while, you may close the waiting page and use the job ID to check the results later. Please note that the results of your jobs are stored on our server for THREE DAYS. After 7 days all information about the job is deleted. If you need the storage period extended, contact me.

4. Explore results
agriGO provides different ways to browse the results of different tools. Some of them are flexible, but you may need specific settings to make them cater to your own needs. The detailed introduction to these tools in the rest of this manual will help you achieve that.

How to use Singular Enrichment Analysis (SEA)?

SEA is a traditional and widely used method. It is simple to use and simple to understand. The user only needs to prepare a list of gene/probe names, and enriched GO terms are identified by a statistical test against a pre-calculated background or a customized one.
STEP 1:
To use SEA, first select the type of your query list: either plain names or names with GO accessions. If you choose a species supported by agriGO, you only need to provide a list of sequence identifiers. It is best to select the species first, check all ID types allowed for that species, and then submit your IDs. Only allowed IDs can be analyzed in this mode. You can mix IDs of different types, as long as all of them are allowed.

If you choose customized mode, you are no longer limited to the species supported by agriGO. You can use any IDs you have, but note that each ID must be accompanied by its GO accession!
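As a rough illustration of the customized mode, the sketch below checks that every line of a query list carries a GO accession before submission. The tab-separated layout (identifier, then GO accession) and the function name parse_annotated_list are assumptions made for this example; check the agriGO input example for the exact format it expects. The only fixed fact used is that a GO accession has the form "GO:" followed by seven digits.

```python
import re
from collections import defaultdict

# A GO accession is "GO:" followed by exactly seven digits.
GO_PATTERN = re.compile(r"^GO:\d{7}$")

def parse_annotated_list(text):
    """Parse 'ID<TAB>GO accession' lines into {id: set of GO accessions}.

    The tab-separated layout is an assumption for this sketch, not the
    documented agriGO format. Lines without a valid accession are returned
    separately so they can be fixed before submission.
    """
    annotations = defaultdict(set)
    rejected = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2 or not GO_PATTERN.match(fields[1].strip()):
            rejected.append(line)
            continue
        annotations[fields[0].strip()].add(fields[1].strip())
    return dict(annotations), rejected

# Hypothetical identifiers, used only to exercise the function.
example = "geneA\tGO:0006950\ngeneA\tGO:0009409\ngeneB\tGO:12345\ngeneC"
annotations, rejected = parse_annotated_list(example)
print(annotations)
print(rejected)
```

In this example geneB is rejected because its accession has too few digits, and geneC because it has no accession at all.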

In principle, you can now click Submit to perform the analysis and skip the following steps. If you want to set more advanced parameters, keep reading this manual.

STEP 2:
Now you can set the background or reference. There are three types: suggested backgrounds, customized reference and customized annotated reference. The default is to use the suggested backgrounds. For each species, agriGO lists all possible background types. For species without a relatively complete profile, backgrounds from neighboring organisms are suggested. Users can select one based on their practical needs, or otherwise use a customized reference.

If you do not want any of the suggested pre-computed backgrounds, you can use a customized reference instead. NOTE: IDs in the reference list should come from the same species that was selected above for the query list.

You can also use any IDs if you choose the customized annotated reference mode. The price of this freedom is that each ID must be accompanied by its GO accession. :)

You can paste the list directly or upload a file. For the latter, please make sure the file is no bigger than 4MB.

STEP 3:
The advanced options are optional but quite important. They are hidden by default and need one click to become visible. In SEA, there are three statistical test methods: hypergeometric, chi-square and Fisher test.
When the input/query list is compared with a previously computed background, or is a subset of the reference list, choose hypergeometric or Fisher. When both your query list and your reference list are quite small, the Fisher test is the better choice. When the input/query list has few or no intersections with the reference list, the chi-square test is more appropriate.
Next, you can choose the method for multiple-test adjustment. Seven adjustment methods are available: Yekutieli (FDR under dependency), Bonferroni, Hochberg, Hochberg (FDR), Hommel, Holm, False Discovery Rate. Although I suggest applying an adjustment, you can turn it off and use no adjustment. If you choose no adjustment, you may want to set a more stringent significance level below. Terms that pass the significance cutoff are highlighted and emphasized in the analysis results, so this setting affects your test output.
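To make the hypergeometric option mentioned above more concrete, here is a minimal sketch of the upper-tail hypergeometric p-value for one GO term. It is a textbook formulation written for illustration, not agriGO's own code; the function name and argument names are assumptions.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """Probability of seeing k or more annotated genes in the query list.

    k: query genes annotated with the GO term
    n: size of the query list
    K: background genes annotated with the GO term
    N: size of the background (the query is assumed to be a subset of it)
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the hypergeometric probabilities for k, k+1, ..., min(n, K).
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Toy numbers: 2 of 3 query genes carry a term that 4 of 10 background genes carry.
print(hypergeom_enrichment_p(2, 3, 4, 10))
```

With these toy numbers the tail is (6*6 + 4*1) / 120, that is one third. The one-sided Fisher test on the corresponding 2x2 table is based on the same distribution, which is why the two options are recommended for the same situation.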
Minimum number of mapping entries: GO annotations that do not appear in at least the selected number of entries will not be shown. In other words, the higher you set this number, the more entries are needed for a GO term to appear in the analysis result.
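The effect of the minimum number of mapping entries can be sketched as a simple filter. The function name, the default of 5 and the toy counts are assumptions for illustration only.

```python
def filter_by_min_entries(term_counts, min_entries=5):
    """Keep only GO terms mapped by at least min_entries query entries.

    term_counts maps a GO accession to the number of query entries
    annotated with it. The default of 5 is an arbitrary value chosen
    for this sketch.
    """
    return {term: count for term, count in term_counts.items()
            if count >= min_entries}

# Toy counts, not real results.
counts = {"GO:0006950": 12, "GO:0009409": 4, "GO:0008150": 30}
print(filter_by_min_entries(counts, min_entries=5))
```

Raising min_entries to 13 in this example would leave only GO:0008150.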
Gene ontology type: Plant GO slim is a cut-down version of the GO ontologies that contains a subset of the terms in the whole GO, selected for plants.
Last, if you provide an email address, a notification with a link to the results will be sent when the analysis is completed. Providing an email address is optional for SEA, because the analysis is very fast.

Congratulations! You can now click Submit to perform the analysis. You can always get interactive help from the help frames on the right and a detailed tutorial in this manual. If you still have questions, contact me directly. Next, we discuss the outputs of the SEA analysis.

Singular Enrichment Analysis (SEA) Results

Part 1:
A brief summary of your job is given. The job ID is valid for 7 days. A file containing all entities in the query list that can be annotated by GO, together with their descriptions, is available for download.

Part 2:
In this part, you can browse the hierarchical graph result. Note that the graphical result is generated as separate graphs for each of the three GO categories, namely biological process, molecular function and cellular component. After selecting the category, users can specify their preferred output format, graph rank direction and font size. The result format is the output format you prefer. The rank direction defines the direction of the graph in your output; for instance, the direction in the example image is 'top to bottom'. The font size is self-explanatory: users can set a smaller size if there are many nodes in their result.

Click the 'generate image' button after you have set all parameters. The graphical result will be presented according to your settings. It is a GO hierarchical image containing all statistically significant terms. The nodes in the image are classified into ten levels, each associated with a specific color. The smaller a term's adjusted p-value, the more statistically significant it is, and the darker and redder the node's color (Note: adjusted p-value here means the multiple-test adjusted p-value). Inside the box of a significant term, the information includes: GO term, adjusted p-value, GO description, the number of items mapped to the GO term in the query list and in the background, and the total number of items in the query list and in the background. For terms whose adjusted p-value is higher than the cutoff set by the user, only the GO information is given in the box. To better understand the graphical result, consulting the annotation diagram is suggested. If the user chooses the PNG, JPG or GIF result format, clicking a block links to the term's details.

Part 3:
The terms selected here are child terms of the root term (also called secondary level terms) or significant terms of the secondary level terms. Thus, the bar chart gives the user a brief overview, since these GO terms are relatively general descriptions. As with the graphical result, users should specify their parameters before creating the GO abundance chart. Users can try these settings to obtain their preferred view of the bar chart. Note that the settings you use are recorded in your cookie and become the defaults for your future jobs. In other words, you may try several times, and your last attempt is kept as your own preference.

Here the bar chart uses the glass bar style, default colors, GO annotation as X legend, a 14px font and 300 for X legend rotation. Here are some tips: 1. Have a glance at all four bar styles and select the one you like. 2. Use HEX format to define colors; there is a website we already suggested. 3. If you prefer GO annotation as the X legend content, you may use a smaller font when there are too many words. 4. A value from 270 to 315 is suggested for X legend rotation, where 270 means vertical and 315 means a 45 degree slope; you can try other numbers that suit your taste, though they look somewhat strange to me. The bar chart is created with scripts from Open Flash Chart, which is powerful. You can drag the borders to resize the image and adjust its ratio. Clicking a bar leads to the term's detailed information. A Save as Image button exists but only works in the Firefox browser; you can also use the Print Screen key on your keyboard or other tools to save the image.

Part 4:
In this part, detailed information is given. All significant GO terms are presented in the following table. You can browse the GO terms in tree traversing mode (discussed later), browse all GO terms in a table of the same type, or view just the data. Users can select terms to draw a graphical result or create a bar chart. Please note that the parameters used to generate the graph or chart are fetched from your cookie, and your cookie is set or changed when you generate the graphical result or GO abundance chart described in Parts 2 and 3. This is a little inconvenient: to adjust the images created here, you have to redo the Part 2 or Part 3 step once more to change the settings. Clicking the checkbox to the left of 'GO term' selects all GO terms at once.

In tree traversing mode, you can click a GO name to collapse or expand ontology terms. A button that selects or deselects all significant terms is available, and the selected terms can be used to draw graphical results or create a bar chart. Please note that at least one significant term should be included in graph generation; otherwise the graphical result will be blank and meaningless. Clicking the number leads you to the term's detailed information.

The term's detail page is as follows. Besides a brief summary, agriGO lists all entries that can be annotated to the term. For each entry, the annotation includes: GO terms, GO source and description.



How to use Parametric Analysis of Gene Set Enrichment (PAGE)

The PAGE method was proposed by Kim and Volsky [BMC Bioinformatics 2005, 6:144]. Based on the Central Limit Theorem in statistics, this method is simple and efficient. Unlike SEA, it takes expression levels into account and can deal with a long list of genes/probesets.
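The core of the PAGE calculation can be sketched in a few lines. Following the cited paper, the Z-score of a gene set is Z = (Sm - mu) * sqrt(m) / sigma, where mu and sigma are the mean and standard deviation of all values in the dataset, Sm is the mean of the values in the gene set, and m is the size of the gene set. The code below is an illustrative sketch, not agriGO's implementation; the function names are assumptions, and the use of the population standard deviation is a choice made for this example.

```python
from math import sqrt, erfc
from statistics import mean, pstdev

def page_z_score(all_values, set_values):
    """Z-score of one gene set against the whole dataset."""
    mu = mean(all_values)        # mean of all expression values
    sigma = pstdev(all_values)   # population standard deviation (assumed)
    m = len(set_values)          # number of genes in the set
    return (mean(set_values) - mu) * sqrt(m) / sigma

def two_sided_p(z):
    """Two-sided p-value of a Z-score under the standard normal distribution."""
    return erfc(abs(z) / sqrt(2))

# Toy log2 fold changes; the gene set holds the two largest values.
dataset = [-2.0, -1.0, 0.0, 1.0, 2.0]
gene_set = [1.0, 2.0]
z = page_z_score(dataset, gene_set)
print(z, two_sided_p(z))
```

A positive Z-score indicates that the gene set is shifted toward higher values than the dataset as a whole, and a negative one toward lower values.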
STEP 1:
First, choose the species for your query data. Please make sure that the identifiers in your input belong to one of the data types listed in the information table on the right. If your identifiers are not stored in agriGO, there are two other ways: one is to provide your own GO annotation file, the other is to use our BLAST4ID service.

STEP 2:
In PAGE analysis, users should pay more attention to the input data. As presented in the following image, at least two rows must be provided. Note that 'row' in this manual refers to a column of the input table. The first row holds the sequence identifiers, and the following rows hold numerical values. The numerical values are fold change (FC) or log2-transformed FC values (the latter preferred) of the identifiers' expression under different conditions. If you do not have expression data, SEA may be the alternative choice. The agriGO example contains 3 rows. The first row is the ATH1 probeset name, the second row is the expression fold change (FC) of cold treatment relative to CK (cold/CK) after half an hour, and the third row is the expression FC of cold/CK after 24 hours of cold treatment. Only 600 probesets are included in the quick example so that the HTML page loads fast. To get a full view of the PAGE method, you can download the full example file and follow the analysis procedure below.
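Since log2-transformed values are preferred, raw fold changes can be converted before submission. The sketch below assumes one identifier followed by one or more whitespace-separated numerical values per line; that layout, the function name and the probe names are assumptions for illustration, so compare with the downloadable example file before relying on it.

```python
from math import log2

def parse_page_input(text, already_log2=True):
    """Parse lines of 'identifier value1 [value2 ...]' into {id: [values]}.

    If already_log2 is False, the values are treated as raw fold changes
    and log2-transformed. Raw fold changes must be positive.
    """
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # need an identifier plus at least one value
        values = [float(v) for v in fields[1:]]
        if not already_log2:
            values = [log2(v) for v in values]
        table[fields[0]] = values
    return table

# Hypothetical probe names with raw fold changes for two conditions.
raw = "probe1 2.0 0.5\nprobe2 4.0 1.0"
print(parse_page_input(raw, already_log2=False))
```

After the transform, up-regulation and down-regulation become symmetric around 0: a fold change of 2 maps to 1 and a fold change of 0.5 maps to -1.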

STEP 3:
Next, you can choose the method for multiple-test adjustment. Seven adjustment methods are available: Yekutieli (FDR under dependency), Bonferroni, Hochberg, Hochberg (FDR), Hommel, Holm, False Discovery Rate. Although I suggest applying an adjustment, you can turn it off and use no adjustment. If you choose no adjustment, you may want to set a more stringent significance level below. Terms that pass the significance cutoff are highlighted and emphasized in the analysis results, so this setting affects your test output. Minimum number of mapping entries: GO annotations that do not appear in at least the selected number of entries will not be shown. In other words, the higher you set this number, the more entries are needed for a GO term to appear in the analysis result.
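To show what a multiple-test adjustment does to a list of p-values, here is a minimal sketch of two of the simplest procedures: Bonferroni and the Benjamini-Hochberg FDR procedure (which the 'Hochberg (FDR)' option presumably refers to; that mapping is an assumption). These are standard textbook versions, not agriGO's own code.

```python
def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never decrease as the raw p-values increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy p-values, not real results.
pvals = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(pvals))
print(benjamini_hochberg(pvals))
```

Bonferroni is the more conservative of the two, so fewer terms pass the same significance cutoff after it is applied.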
Gene ontology type: Plant GO slim is a cut-down version of the GO ontologies that contains a subset of the terms in the whole GO, selected for plants. You can also upload your own customized GO annotation file if your identifiers are not accepted directly by agriGO. The file size is limited to 4MB.

Now you can click Submit to start the analysis. The output of the analysis is explained in the following part of this manual.

Parametric Analysis of Gene Set Enrichment (PAGE) Results

Results generated by PAGE share many features with SEA results, so it is suggested to read the SEA results introduction first. Only features unique to PAGE results are explained here. Since the PAGE tool can analyze several rows at a time, and terms are calculated for each row, each row has its own significant GO terms. The number of significant GO terms for each row is listed in the brief summary part.

The number of terms shown is determined by the row you selected, which is colored red. A simple color model, CM for short, is available. The colors used in the CM are the same as those used in the graphical result, in which the red color system means up-regulated and blue means down-regulated. Each block presents the term's Z-score for the row. You can select row(s) and term(s) to generate further images.

The term's detailed information is generated when you click the number. This page is kept simple because there may be too many entries mapping to the GO term.

In the graphical result part, users can choose one or two rows to draw the image. If two rows are selected, a third color system (purple colors) is used to mark terms that are regulated in different directions in the two rows.

The following example presents two rows in one graph. You can check the annotation diagram below the result. There are three color systems: red means up-regulated terms, blue means down-regulated terms, and purple means the term is regulated in different directions in the two rows. If the term has the same regulation direction in both rows, it has double borders. In the box, 'r1=1e-10' means the adjusted p-value of the term in row1 is 1e-10, and 'zs' stands for Z-score.
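The coloring rule for a two-row graph can be summarized as a small decision function. This is only a sketch of the rule as described above: the function name is hypothetical, the direction is taken from the sign of the Z-score, a Z-score of exactly 0 is treated as down-regulated by assumption, and the shading by significance level is ignored.

```python
def node_style(z_row1, z_row2):
    """Return (color, double_border) for a term drawn from two rows.

    red    : up-regulated in both rows
    blue   : down-regulated in both rows
    purple : regulated in different directions in the two rows
    A double border marks terms with the same direction in both rows.
    """
    up1, up2 = z_row1 > 0, z_row2 > 0  # Z == 0 counts as down (assumption)
    if up1 != up2:
        return ("purple", False)
    return ("red" if up1 else "blue", True)

print(node_style(2.1, 3.0))
print(node_style(-1.5, -0.2))
print(node_style(2.0, -2.0))
```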

In the bar chart generation part, Z-score is the statistical value from the PAGE calculation, and mean value is the mean of the values of all entries in the row. Mean change is the mean minus the standard deviation, which presents the change of expression compared with the whole row background. Users can set two color values, one for up-regulated terms and one for down-regulated terms.

As mentioned before, Z-scores bigger than 0 and smaller than 0 are presented in different colors, which are set by the user.

But if you choose mean value, the bars are all in the same color, since all means are bigger than 0.

Entrance II: single species analysis

How to perform a Batch Singular Enrichment Analysis (SEA)

STEP 1:
When using agriGO v2.0 for the first time, you can select your species of interest or a model species from the table. We provide 10 model species covering the different categories of the new classification system. The model plant Arabidopsis thaliana is used as an example to illustrate the process.

STEP 2:
To use the SEA tool, you should first select the type of query list. You can input multiple query lists at a time by adding gene list boxes.

STEP 3:
Now, you can set the background or reference. Suggested backgrounds and customized references are provided. The default parameter uses the suggested background. For each species, agriGO v2.0 will provide all of the possible background types.

STEP 4:
The advanced options are optional but important. These options are hidden by default and become visible with one click. In the SEA analysis, the new option PVD (p-value distribution) was added. PVD displays the distribution of the p-values of significant GO terms from the query and from random datasets. The line chart illustrates the location of the query result and its percentage among the 99 random results. PVD is not calculated by default and is not recommended for a first analysis.
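The idea of locating a query p-value among random results can be sketched as follows. How agriGO draws its random datasets and computes the chart is not described here, so this example simply takes a list of p-values from random runs as given; the function name and the toy values are assumptions.

```python
def pvd_position(query_p, random_ps):
    """Locate a query p-value among p-values from random datasets.

    Returns (count, percentage): the number and the percentage of random
    p-values that are less than or equal to the query p-value.
    """
    if not random_ps:
        raise ValueError("at least one random p-value is required")
    count = sum(1 for p in random_ps if p <= query_p)
    return count, 100.0 * count / len(random_ps)

# 99 made-up p-values standing in for the 99 random results.
random_ps = [i / 100 for i in range(1, 100)]
print(pvd_position(0.001, random_ps))
print(pvd_position(0.055, random_ps))
```

A query p-value that is smaller than all of the random p-values, as in the first call, suggests that the enrichment is unlikely to arise from a random gene list of the same kind.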

Batch SEA results:

PART 1:
All of the successful job IDs are displayed in the analysis results. You can run a SEACOMPARE by selecting job IDs.

PART 2:
When the PVD option is set to 'yes', a new column named PVD appears in the 'Detailed information' table. For each significant GO term, it shows two distribution line charts with different Y-axes (normal and logarithmic scales).