E-values and Bit-scores in BLAST (2024)

Table of Contents
  • E-value
  • Bit-score
  • FAQs

E-value

The BLAST E-value is the number of expected hits of similar quality (score) that could be found just by chance.

An E-value of 10 means that up to 10 hits of this quality can be expected just by chance in a random database of the same size.

The E-value can be used as a first quality filter for BLAST search results: only hits with an E-value equal to or better (smaller) than the number given by the -evalue option are reported. BLAST results are sorted by E-value by default (best hit on the first line).

blastn -query genes.ffn -subject genome.fna -evalue 1e-10 
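The same cutoff can also be applied after the search. The sketch below assumes the search was run with tabular output (-outfmt 6, an option not used in the command above) and that the default twelve columns are present; the three example rows are made up for illustration.

```python
# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def parse_hits(lines):
    """Turn tab-separated -outfmt 6 lines into dicts with numeric evalue/bitscore."""
    hits = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, line.split("\t")))
        hit["evalue"] = float(hit["evalue"])      # float() accepts "3e-12" and "0.0"
        hit["bitscore"] = float(hit["bitscore"])
        hits.append(hit)
    return hits


def filter_by_evalue(hits, max_evalue):
    """Keep hits whose E-value is equal to or better (smaller) than max_evalue."""
    return [h for h in hits if h["evalue"] <= max_evalue]


# Hypothetical rows, not real search results.
example = [
    "geneA\tcontig1\t99.5\t600\t3\t0\t1\t600\t1001\t1600\t0.0\t1090",
    "geneB\tcontig2\t82.1\t150\t25\t2\t10\t159\t500\t647\t3e-12\t75.2",
    "geneC\tcontig7\t90.0\t20\t2\t0\t5\t24\t88\t107\t4.1\t22.3",
]
hits = parse_hits(example)
strict = filter_by_evalue(hits, 1e-10)
print([h["qseqid"] for h in strict])
```

Filtering afterwards lets one search with a loose threshold be re-examined at several stricter cutoffs without rerunning BLAST.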

The smaller the E-value, the better the match.

-evalue 1e-50  

small E-value: low number of hits, but of high quality

BLAST hits with an E-value smaller than 1e-50 are database matches of very high quality.

-evalue 0.01

BLAST hits with an E-value smaller than 0.01 can still be considered good hits for homology matches.

-evalue 10   (default)

large E-value: many hits, partly of low quality

A threshold of 10 admits hits that cannot be considered significant but may give an idea of potential relationships.

The E-value (expectation value) is a bit-score corrected for the size of the sequence database, so it depends on the database that was searched. Because larger databases increase the chance of false-positive hits, the E-value compensates for that higher chance; it is a correction for multiple comparisons. As a result, the same hit gets a better (smaller) E-value in a smaller database.

$E = m \cdot n / 2^{\text{bit-score}}$

        $m$ - query sequence length

        $n$ - total database length (sum of all sequences)
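A minimal sketch of this formula, to make the database-size dependence concrete. The query length, database lengths, and bit-score below are illustrative assumptions, and the function name is hypothetical; BLAST itself also applies length corrections that are not modeled here.

```python
def evalue_from_bitscore(bit_score, query_len, db_len):
    """E = m * n / 2**bit_score, with m = query_len and n = db_len."""
    return query_len * db_len / 2.0 ** bit_score


m = 1000                                         # assumed query length
small_db = evalue_from_bitscore(50, m, 10**6)    # assumed 1 Mb database
large_db = evalue_from_bitscore(50, m, 10**9)    # assumed 1 Gb database

print(f"small database: E = {small_db:.2e}")
print(f"large database: E = {large_db:.2e}")
print(f"ratio: {large_db / small_db:.0f}")
```

The same 50-bit hit has an E-value 1000 times larger in the database that is 1000 times larger, while each additional bit of score halves E.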

Bit-score

The higher the bit-score, the better the sequence similarity.

The bit-score is the required size of a sequence database in which the current match could be found just by chance. The bit-score is a log2-scaled and normalized raw score. Each increase by one doubles the required database size ($2^{\text{bit-score}}$).

The bit-score does not depend on database size. It gives the same value for hits in databases of different sizes and hence can be used for searching in a constantly growing database.
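Rearranging the E-value formula gives the bit-score a hit needs to reach a chosen E-value in a given search space. The sketch below assumes that simple formula holds exactly; the lengths and the threshold are illustrative values, and the function name is hypothetical.

```python
import math


def bitscore_needed(max_evalue, query_len, db_len):
    """Smallest bit-score with E <= max_evalue, from E = m * n / 2**bit_score."""
    return math.log2(query_len * db_len / max_evalue)


base = bitscore_needed(1e-10, 1000, 10**9)           # assumed query and database lengths
doubled = bitscore_needed(1e-10, 1000, 2 * 10**9)    # same search, database twice as large

print(f"bits needed: {base:.2f}")
print(f"bits needed after doubling the database: {doubled:.2f}")
```

Doubling the database raises the required bit-score by exactly one bit, which is the doubling relationship described above.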

From http://www.metagenomics.wiki/tools/blast/evalue

The E-value provides information about the likelihood that a given sequence match is purely by chance. The lower the E-value, the less likely the database match is a result of random chance and therefore the more significant the match is. Empirical interpretation of the E-value is as follows. If E < 1e-50 (or 1 × 10^-50), there should be an extremely high confidence that the database match is a result of homologous relationships. If E is between 1e-50 and 0.01, the match can be considered a result of homology. If E is between 0.01 and 10, the match is considered not significant, but may hint at a tentative remote homology relationship. Additional evidence is needed to confirm the tentative relationship. If E > 10, the sequences under consideration are either unrelated or related by extremely distant relationships that fall below the limit of detection with the current method. Because the E-value is proportionally affected by the database size, an obvious problem is that as the database grows, the E-value for a given sequence match also increases.
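These empirical ranges can be written as a small lookup. This is only a sketch of the rule of thumb above; the function name is hypothetical, and the handling of values that fall exactly on a boundary (0.01 or 10) is an assumption, since the text does not specify it.

```python
def interpret_evalue(evalue):
    """Map an E-value to the empirical categories described in the text."""
    if evalue < 1e-50:
        return "extremely high confidence of homology"
    if evalue < 0.01:
        return "homology"
    if evalue <= 10:
        return "not significant, possible remote homology"
    return "unrelated or below the limit of detection"


for e in (0.0, 3e-12, 0.5, 25):
    print(e, "->", interpret_evalue(e))
```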

A bit score is another prominent statistical indicator used in addition to the E-value in BLAST output. The bit score measures sequence similarity independent of query sequence length and database size and is normalized based on the raw pairwise alignment score. The bit score (S') is determined by the following formula: S' = (λ × S − ln K) / ln 2, where λ is the Gumbel distribution constant, S is the raw alignment score, and K is a constant associated with the scoring matrix used. Clearly, the bit score (S') is linearly related to the raw alignment score (S). Thus, the higher the bit score, the more significant the match is. The bit score provides a constant statistical indicator for searching different databases of different sizes or for searching the same database at different times as the database enlarges.
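The normalization can be sketched in a few lines. The values λ = 0.267 and K = 0.041 are assumptions: they are the figures commonly quoted for gapped BLOSUM62 alignments with gap costs 11/1 and are not taken from this text. The actual constants depend on the scoring system and are listed in the BLAST report.

```python
import math


def bit_score(raw_score, lam, k):
    """Bit score = (lambda * raw_score - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


LAMBDA = 0.267   # assumed value (gapped BLOSUM62, gap open 11, gap extend 1)
K = 0.041        # assumed value for the same scoring system

for raw in (50, 100, 200):
    print(raw, "->", round(bit_score(raw, LAMBDA, K), 1))
```

Because the relation is linear, each additional raw-score point adds the same λ / ln 2 bits regardless of the starting score.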

From https://www.biostars.org/p/187230/

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3820096/

The bit-score provides a better rule-of-thumb for inferring homology. For average length proteins, a bit score of 50 is almost always significant. A bit score of 40 is only significant (E() < 0.001) in searches of protein databases with fewer than 7000 entries. Increasing the score by 10 b

From https://www.biostars.org/p/187230/
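
As a rough check of the 7000-entry figure, the E-value formula given earlier ($E = m \cdot n / 2^{\text{bit-score}}$) can be evaluated under an assumption that is not stated in the source: a query and database proteins of about 400 residues each, so that $m = 400$ and $n = 7000 \cdot 400$.

$E = 400 \cdot (7000 \cdot 400) / 2^{40} \approx (1.12 \times 10^{9}) / (1.10 \times 10^{12}) \approx 1.0 \times 10^{-3}$

Under that assumed length, a 40-bit hit sits right at the E = 0.001 boundary, and any larger database pushes it above the threshold.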


FAQs

What is the E-value and bit score in BLAST?

Bit scores are normalized, which means that the bit scores from different alignments can be compared, even if different scoring matrices have been used. The E-value gives an indication of the statistical significance of a given pairwise alignment and reflects the size of the database and the scoring system used.

How do you interpret the E-value in BLAST search results?

BLAST results are sorted by E-value by default (best hit on the first line). The smaller the E-value, the better the match. BLAST hits with an E-value smaller than 1e-50 are database matches of very high quality. BLAST hits with an E-value smaller than 0.01 can still be considered good hits for homology matches.

What does an E-value of 0.0 mean in BLAST?

An E-value of 0.0 means that the number of alignments with an equivalent or greater score expected to occur in the database by chance is so small that it is reported as zero. The lower the E-value, the more significant the score and the better the quality of the alignment.

What is the E-value in PSI-BLAST?

The e-value is a parameter that describes the number of hits one can “expect” to see by chance when searching a database of a particular size. It decreases exponentially with the score (S) that is assigned to a match between two sequences.

What is a good E-value?

The E-value is the expectation value that indicates the number of alignments with a score ≥ S that one can expect to find by chance in a database of size N. Hence, the E-value is dependent on the database size and the query length. The closer the E-value is to 0, the better the alignment. For E < 1e−2 (= 1 × 10^−2 = 0.01), P ≈ E.

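What is considered a large E-value?

In BLAST, a large E-value means that many hits of similar score are expected by chance, so the match is not significant; values near the default threshold of 10 fall in this range. An unrelated quantity that is also called the E-value is used in epidemiology, where a large E-value implies that considerable unmeasured confounding would be needed to explain away an effect estimate, and a small E-value implies that little would be needed.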

How to interpret BLAST results?

Interpreting BLAST Results. BLAST results show all of the taxa that share sequence similarity with the query sequence based on the selected database. The results page includes a search summary, hit description table, graphic summary, and alignments that can help determine the quality or accuracy of a given hit.

Can E-values be negative in BLAST?

No. E-values are expected counts rather than probabilities, so they can be greater than 1, but they cannot be lower than 0. If NCBI BLAST returns an E-value of, say, "4e-19", it means 4 × 10^−19, which is a very small positive number and not a negative value.

How do you interpret E numbers?

In scientific notation, "e" stands for "times ten to the power of"; it is not the mathematical constant e (approximately 2.71828183). Software such as Prism switches to scientific notation when values are very large or very small. For example, 2.3e-5 means 2.3 times ten to the minus five power, or 0.000023.
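Most programming languages read this notation directly. As a small illustration in Python (the value "4e-19" is the BLAST-style E-value mentioned in the previous answer):

```python
# float() parses scientific notation, so E-values can be compared numerically.
small = float("2.3e-5")
evalue = float("4e-19")

print(small == 0.000023)   # same number written two ways
print(evalue > 0)          # a tiny positive number, not a negative one
print(evalue < 1e-10)      # passes a 1e-10 cutoff
```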

What is a bad E-value?

  • 10e-10 < E-value < 1: could be a true homologue, but it is a gray area.
  • E-value > 1: proteins are most likely not related.
  • E-value > 10: hits are most likely junk unless the query sequence is very short.

What does an E-value of 1 mean?

Interpreting E-values

The E-value describes the number of hits we expect to see by chance when BLASTing a database. It helps us understand if our hits are relatively unique or not. For example, an E-value of 1 means that one expects by chance to see 1 match with a similar score.

What is the difference between P-value and E-value in BLAST?

The BLAST programs report E-values rather than P-values because it is easier to understand the difference between, for example, E-values of 5 and 10 than P-values of 0.993 and 0.99995. However, when E < 0.01, P-values and E-values are nearly identical.
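The figures quoted here follow from the standard relation P = 1 − e^(−E), which treats the number of chance hits as Poisson-distributed. That relation is not stated in the text above, so the sketch below should be read as an illustration under that assumption.

```python
import math


def pvalue_from_evalue(evalue):
    """P = 1 - exp(-E): probability of at least one chance hit with this score or better."""
    return -math.expm1(-evalue)   # expm1 keeps precision when E is tiny


for e in (10, 5, 0.01, 1e-5):
    print(f"E = {e:<6} P = {pvalue_from_evalue(e):.6g}")
```

For E = 5 and E = 10 the P-values are both close to 1 and hard to tell apart, while for small E the two quantities nearly coincide.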

What is the difference between E-value and expect threshold?

The expect threshold is the cutoff on the expected number of chance matches in a random model; hits with a larger E-value are not reported. The E-value of an individual hit shows the expected number of chance hits with that score or better.

Why is PSI-BLAST better?

PSI-BLAST may be more sensitive than BLAST, meaning that it might be able to find distantly related sequences that are missed in a BLAST search.

What is the max score on BLAST?

Max[imum] Score: the highest alignment score calculated from the sum of the rewards for matched nucleotides and penalties for mismatches and gaps.

What is the bit score?

A bit score is another prominent statistical indicator used in addition to the E-value in BLAST output. The bit score measures sequence similarity independent of query sequence length and database size and is normalized based on the raw pairwise alignment score.

What is an e-value in statistics?

In statistical hypothesis testing, e-values quantify the evidence in the data against a null hypothesis (e.g., "the coin is fair", or, in a medical context, "this new treatment has no effect"). They serve as a more robust alternative to p-values, addressing some shortcomings of the latter.

What does the value 6e12 mean?

6e12 is shorthand for 6 x 10^12, which means 6 multiplied by 10 to the power of 12. To calculate the value of 6e12, first note that 10 to the power of 12 is equal to 1000000000000. Then, multiply 6 by 1000000000000 to get 6000000000000. Therefore, 6e12 is equal to 6000000000000.

Article information

Author: Lilliana Bartoletti
