Measuring BLAST hits - mn/ibv/bioinfwiki (2024)

Contents

  • 1 Measuring BLAST hits
    • 1.1 Similarity: bitscores
    • 1.2 Significance: E-values
    • 1.3 Similarity vs. significance

Similarity: bitscores

BLAST offers several ways of evaluating the quality of a hit. The bitscore (or just score) is a single number representing the quality of the BLAST-generated query/hit alignment. Higher bitscores imply better alignments, but note that an alignment can become “better” in two ways: by growing longer, or by improving the match between the aligned characters (i.e. identical amino acids rather than merely similar ones). Provided that the scoring matrix and the gap-opening and gap-extension costs are the same, bitscores can to some degree be compared between different BLAST searches. Thus you may conclude that search 1 gave a better query/hit match than search 2 if it produced a higher bitscore. Still, the higher bitscore in search 1 may stem from a relatively short but perfect alignment, while search 2 may have produced a much longer but imperfect alignment that scored lower. In this scenario, it is not certain that the search 1 result is “better” than the search 2 result. In the end, this must be judged by the user.

Significance: E-values

In addition to the bitscore, an e-value is reported for each BLAST hit. This value indicates whether the hit may be due to chance rather than to a real similarity between query and hit sequence. The e-value is derived from the bitscore, but is scaled according to the sizes of the query and the database. As a result, “good” e-values are very small positive numbers (in theory, the value can never equal zero, but BLAST may still report an e-value of zero because of floating-point rounding). “Good” in this context means that it is very unlikely that the BLAST hit arose by chance alone; a true similarity between query and hit can be assumed. In other words, the BLAST hit is statistically significant.

This calculation assumes a random database and a random query sequence. Under this model, and given a certain query-hit alignment, the e-value quantifies how often a similar (or better) alignment is expected. Note that the e-value is NOT a p-value (here, the p-value is the probability of finding at least one hit this good by chance). Rather, the e-value is the number of hits as good as or better than our hit that we expect with a random database and a random query. Thus a hit with an e-value of 2 means that two hits equal to or better than this hit can be expected in a random scenario. In other words, this hit is quite likely to have been caused by chance, and is not significant. The use of e-values instead of p-values is nothing more than a convention; one can be transformed into the other: P = 1 - e^{-E} (P = p-value, E = e-value). Furthermore, for small values (E < 0.01), the two numbers are nearly identical.
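The conversion between e-values and p-values can be checked with a few lines of Python. This is only a minimal sketch of the formula quoted above; the function name evalue_to_pvalue and the sample values are chosen here for illustration and are not part of BLAST.

```python
import math

def evalue_to_pvalue(e_value: float) -> float:
    """Convert an e-value E into a p-value P = 1 - e^{-E}."""
    # -expm1(-E) equals 1 - exp(-E), but stays accurate when E is tiny
    return -math.expm1(-e_value)

# Illustrative e-values only; small ones give P close to E, large ones give P close to 1
for e in (1e-50, 0.001, 0.01, 1.0, 2.0, 10.0):
    print(f"E = {e:g}  ->  P = {evalue_to_pvalue(e):.6g}")
```

For E = 2 the p-value is already about 0.86, whereas for E = 0.01 it is about 0.00995, which is why the two measures are used almost interchangeably for small values.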

As stated above, the e-value depends on the size of the database (and, to a lesser degree, on the size of the query). Increasing the size of the database makes it harder to achieve low (i.e. significant) e-values. The reason for this is simple: in the random model of our database, more sequences (or longer sequences) give more opportunities for finding a random hit. Imagine, for instance, throwing a die 60 times. You will probably get around 10 sixes. Increasing the number of throws also increases the number of sixes (6000 throws will produce around 1000 sixes).

This also means that two hits with equal alignments will have different e-values if they resulted from searches against databases of different sizes. In fact, of two identical hits (i.e. identical queries, identical hit sequences and identical resulting alignments), one may be highly significant whilst the other is insignificant. The difference is caused solely by the differing database sizes.
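The effect of database size can be illustrated numerically. The sketch below assumes the standard Karlin-Altschul relation E = m * n * 2^(-S'), where S' is the bitscore, m the query length and n the total database length. This relation is not stated in the text above, and the sketch ignores the effective-length corrections that BLAST applies. The function name and all numbers are hypothetical.

```python
def expected_hits(bitscore: float, query_len: float, db_len: float) -> float:
    """Expected number of chance hits, E = m * n * 2^(-S') (assumed relation)."""
    return query_len * db_len * 2.0 ** (-bitscore)

# The same alignment (bitscore 40, query of 300 residues) against two
# hypothetical databases of very different total length
small_db = expected_hits(40, 300, 1e6)   # database of 10^6 residues
large_db = expected_hits(40, 300, 1e10)  # database of 10^10 residues

print(f"small database: E = {small_db:.3g}")
print(f"large database: E = {large_db:.3g}")
```

Under these assumptions the identical alignment has an e-value well below 0.001 in the small database, but an e-value above 1 in the large one.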

Similarity vs. significance

Often, e-values are used as an indication of the quality of a BLAST hit, i.e. the goodness of the underlying alignment. From the above, it should be clear that this is, at best, an imprecise way of judging BLAST hits. Some of the confusion surrounding this arises because people forget that the e-value quantifies how many random hits are to be expected, and nothing more. Once this is accepted, it is clear that the interpretation of the e-value does not change with database size; an e-value of 1e-50 is clearly significant (not caused by chance) no matter what database produced the hit. On the other hand, the bitscores (i.e. the qualities of the underlying alignments) that produced these e-values of 1e-50 may differ quite a bit. For two hits with identical e-values, the hit from the larger database will have the higher bitscore.
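To see how much the bitscores behind identical e-values can differ, one can solve for the bitscore needed to reach a given e-value. The sketch below assumes the standard relation E = m * n * 2^(-S') (S' = bitscore, m = query length, n = database length), which is not given in the text above, and it ignores BLAST's effective-length corrections. The function name and the database sizes are hypothetical.

```python
import math

def required_bitscore(e_value: float, query_len: float, db_len: float) -> float:
    """Bitscore S' needed to reach a given e-value, S' = log2(m * n / E)."""
    # Obtained by inverting the assumed relation E = m * n * 2^(-S')
    return math.log2(query_len * db_len / e_value)

# Same target e-value (1e-50) and query length, two hypothetical database sizes
for db_len in (1e6, 1e10):
    score = required_bitscore(1e-50, 300, db_len)
    print(f"database of {db_len:g} residues: bitscore of about {score:.1f} needed")
```

Under these assumptions, a database 10,000 times larger requires roughly 13 more bits (log2 of 10,000) to reach the same e-value.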
