PubChem: Difference between revisions

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Revision as of 05:08, 31 October 2014 by BG19bot (1,005,055 edits), minor edit: "WP:CHECKWIKI error fix for #64. Do general fixes if a problem exists. - using AWB (10480)"
Latest revision as of 18:07, 29 April 2024 by Kku (extended confirmed user, 115,375 edits), minor edit: "link RNA interference"
(74 intermediate revisions by 53 users not shown)
Line 1: Line 1:
{{short description|Chemical information database}}
{{Refimprove|date=January 2009}} {{More citations needed|date=January 2009}}


{{infobox biodatabase {{infobox biodatabase
|logo=]
|title = PubChem |title = PubChem
|description = Chemicals and their bioassays
|logo =]
|description = PubChem
|scope = |scope =
|organism = |organism = Humans and other animals
|center = ]
|center =
|laboratory = |laboratory =
|author = |author =
Line 13: Line 14:
|standard = |standard =
|format = |format =
|url = {{URL|https://pubchem.ncbi.nlm.nih.gov/}}
|url =
|pmid = PMID 15879180 |pmid = 15879180
|download = |download =
|webservice = <ref>{{cite journal |last1=Kim |first1=Sunghwan |last2=Thiessen |first2=Paul A. |last3=Cheng |first3=Tiejun |last4=Zhang |first4=Jian |last5=Gindulyte |first5=Asta |last6=Bolton |first6=Evan E. |title=PUG-View: programmatic access to chemical annotations integrated in PubChem |journal=Journal of Cheminformatics |date=9 August 2019 |volume=11 |issue=1 |page=56 |doi=10.1186/s13321-019-0375-2|pmid=31399858 |pmc=6688265 |doi-access=free }}</ref>
|webservice =
|sql = |sql =
|sparql = |sparql =
Line 25: Line 26:
|frequency = |frequency =
|curation = |curation =

}} }}


'''PubChem''' is a ] of ] ]s and their activities against biological assays. The system is maintained by the ] (NCBI), a component of the ], which is part of the United States ] (NIH). PubChem can be accessed for free through a ]. Millions of compound structures and descriptive datasets can be freely downloaded via . PubChem contains substance descriptions and small molecules with fewer than 1000 atoms and 1000 bonds. More than 80 database vendors contribute to the growing PubChem database.<ref>{{Cite web|url = http://pubchem.ncbi.nlm.nih.gov/sources/sources.cgi|title = PubChem Source Information|work = The PubChem Project|location = USA|publisher = National Center for Biotechnology Information}}</ref> '''PubChem''' is a ] of ] ]s and their activities against ]. The system is maintained by the ] (NCBI), a component of the ], which is part of the United States ] (NIH). PubChem can be accessed for free through a ]. Millions of compound structures and descriptive datasets can be freely downloaded via ]. PubChem contains multiple substance descriptions and small molecules with fewer than 100 atoms and 1,000 bonds. More than 80 database vendors contribute to the growing PubChem database.<ref>{{Cite web|url = https://pubchem.ncbi.nlm.nih.gov/sources/sources.cgi|title = PubChem Source Information|work = The PubChem Project|location = USA|publisher = National Center for Biotechnology Information}}</ref>

==History==
PubChem was released in 2004 as a component of the Molecular Libraries Program (MLP) of the NIH. As of November 2015, PubChem contains more than 150 million depositor-provided substance descriptions, 60 million unique chemical structures, and 225 million biological activity test results (from over 1 million assay experiments performed on more than 2 million small-molecules covering almost 10,000 unique protein target sequences that correspond to more than 5,000 genes). It also contains ] (RNAi) screening assays that target over 15,000 genes.<ref>{{Cite journal |last1=Kim |first1=Sunghwan |last2=Thiessen |first2=Paul A. |last3=Cheng |first3=Tiejun |last4=Yu |first4=Bo |last5=Shoemaker |first5=Benjamin A. |last6=Wang |first6=Jiyao |last7=Bolton |first7=Evan E. |last8=Wang |first8=Yanli |last9=Bryant |first9=Stephen H. |date=2016 |title=Literature information in PubChem: associations between PubChem records and scientific articles |journal=Journal of Cheminformatics |volume=8 |pages=Article 32|doi=10.1186/s13321-016-0142-6 |pmid=27293485 |pmc=4901473 |doi-access=free }}</ref>

As of August 2018, PubChem contains 247.3 million substance descriptions, 96.5 million unique chemical structures, contributed by 629 data sources from 40 countries. It also contains 237 million bioactivity test results from 1.25 million biological assays, covering >10,000 target protein sequences.<ref name=":0" />

As of 2020, with data integration from over 100 new sources, PubChem contains more than 293 million depositor-provided substance descriptions, 111 million unique chemical structures, and 271 million bioactivity data points from 1.2 million biological assay experiments.<ref name="2021NAR" />


== Databases == == Databases ==


PubChem consists of three dynamically growing primary databases. As of 29 September 2014: PubChem consists of three dynamically growing primary databases. As of 5 November 2020 (number of BioAssays is unchanged):


* Compounds, 54 million entries <ref>{{cite web|url=http://www.ncbi.nlm.nih.gov/pccompound?term=all%5Bfilt%5D&cmd=search|accessdate=29 September 2014}}</ref> (up from 31 million entries in Jan 2011), contains pure and characterized chemical compounds.<ref>{{Cite web|url = http://www.ncbi.nlm.nih.gov/sites/entrez?term=all%5Bfilt%5D&cmd=search&db=pccompound|title = all[filt&#93; - PubChem Compound Results|work = The PubChem Project|location = USA|publisher = National Center for Biotechnology Information|accessdate = 7 January 2011}}</ref> * Compounds, 111 million entries<ref name="2021NAR">{{cite journal |last1=Kim |first1=Sunghwan |last2=Chen |first2=Jie |last3=Cheng |first3=Tiejun |last4=Gindulyte |first4=Asta |last5=He |first5=Jia |last6=He |first6=Siqian |last7=Li |first7=Qingliang |last8=Shoemaker |first8=Benjamin A |last9=Thiessen |first9=Paul A |last10=Yu |first10=Bo |last11=Zaslavsky |first11=Leonid |last12=Zhang |first12=Jian |last13=Bolton |first13=Evan E |title=PubChem in 2021: new data content and improved web interfaces |journal=Nucleic Acids Research |date=8 January 2021 |volume=49 |issue=D1 |pages=D1388–D1395 |doi=10.1093/nar/gkaa971|pmid=33151290 | pmc=7778930 |doi-access=free }}</ref> (up from 94 million entries in 2017<ref name=":0">{{cite web|url=https://www.ncbi.nlm.nih.gov/pccompound?term=all%5Bfilt%5D&cmd=search|title=Search Results for all compounds|access-date=28 January 2016}}</ref>), contains pure and characterized chemical compounds.<ref>{{Cite web|url = https://www.ncbi.nlm.nih.gov/sites/entrez?term=all%5Bfilt%5D&cmd=search&db=pccompound|title = all[filt&#93; - PubChem Compound Results|work = The PubChem Project|location = USA|publisher = National Center for Biotechnology Information|access-date = 7 January 2011}}</ref>
* Substances, 163.5 million entries<ref>{{Cite web|url = http://www.ncbi.nlm.nih.gov/sites/entrez?term=all%5Bfilt%5D&cmd=search&db=pcsubstance|title = all]s, ] and uncharacterized substances. * Substances, 293 million entries<ref name="2021NAR"></ref> (up from 236 million entries in 2017<ref>{{Cite web|url = https://www.ncbi.nlm.nih.gov/sites/entrez?term=all%5Bfilt%5D&cmd=search&db=pcsubstance|title = all]s, ] and uncharacterized substances.
* BioAssay, ] results from 6059<ref>{{Cite web|url = http://www.ncbi.nlm.nih.gov/sites/entrez?term=all%5Bfilt%5D&cmd=search&db=pcassay|title = all] programs with several million values. * BioAssay, ] results from 1.25 million<ref>{{Cite web|url = https://www.ncbi.nlm.nih.gov/sites/entrez?term=all%5Bfilt%5D&cmd=search&db=pcassay|title = all] programs with several million values.


== Searching == == Searching ==
Line 51: Line 60:
0:500 0:5 0:10 -5:5 0:500 0:5 0:10 -5:5


== Database fields ==
==History==


PubChem was released in 2004.<ref>{{cite web|url=https://pubchem.ncbi.nlm.nih.gov/about.html|title=About PubChem|accessdate=3 May 2014}}</ref>

==ACS's concerns==
The ] has raised concerns about the publicly supported PubChem database, since it appears to directly compete with their existing ].<ref>{{cite journal |doi=10.1126/science.308.5723.774a |date=May 2005 |author=Kaiser J |title=Science resources. Chemists want NIH to curtail database |volume=308 |issue=5723 |pages=774 |pmid=15879180 |journal=]}}</ref> They have a strong interest in the issue since the Chemical Abstracts Service generates a large percentage of the society's revenue. To advocate their position against the PubChem database, ACS has actively lobbied the US Congress.

Soon after PubChem's creation, the ] lobbied ] to restrict the operation of PubChem, which they asserted competes with their ].<ref>{{Cite web|url = http://osc.universityofcalifornia.edu/news/acs_pubchem.html|title = PubChem and the American Chemical Society|work = Reshaping Scholarly Communication|location = USA|publisher = University of California}}</ref>

== Database fields ==
{{Cleanup|section|date=November 2009}}
{| {|
|- |-
Line 143: Line 143:


* ] * ]
** CAS Common Chemistry - run by the American Chemical Society
** ] ** ] - run by North Carolina State University
** ]
** ] - run by European Bioinformatics Institute
** ]
** ] - run by UK's Royal Society of Chemistry
** ]
** ] ** ] - run by the University of Alberta
** ] - run by Swiss-based International Union of Pure and Applied Chemistry (IUPAC)
** ]
** ] - run by India's National Chemical Laboratory
** ]
** PubChem - run by the National Institute of Health, USA
** ]
** ] - run by the University of California, San Diego
* ] (NCBI)
** ] - run by the University of Toronto, Canada
* ]
** ] (NCBI) - run by the National Institute of Health, USA
* ]
** ] - run by the National Institute of Health, USA
* ]
** ] - run by the National Institute of Health, USA


==References== ==References==
Line 160: Line 161:


== External links == == External links ==
{{Wikidata property|P662|P2153|P2874}}

{{Scholia}}
{{Wikidata property|P662|PubChem IDs}}
* * {{official website|https://pubchem.ncbi.nlm.nih.gov}}


{{DEFAULTSORT:Pubchem}} {{DEFAULTSORT:Pubchem}}
Line 173: Line 169:
] ]
] ]
]

Latest revision as of 18:07, 29 April 2024

Chemical information database
PubChem
Content
Description: Chemicals and their bioassays
Organisms: Humans and other animals
Contact
Research center: NCBI
Primary citation: PMID 15879180
Access
Website: pubchem.ncbi.nlm.nih.gov
Download URL: FTP
Web service URL: PUG-View
Miscellaneous
License: Public domain

PubChem is a database of chemical molecules and their activities against biological assays. The system is maintained by the National Center for Biotechnology Information (NCBI), a component of the National Library of Medicine, which is part of the United States National Institutes of Health (NIH). PubChem can be accessed for free through a web user interface. Millions of compound structures and descriptive datasets can be freely downloaded via FTP. PubChem contains multiple substance descriptions and small molecules with fewer than 100 atoms and 1,000 bonds. More than 80 database vendors contribute to the growing PubChem database.

History

PubChem was released in 2004 as a component of the Molecular Libraries Program (MLP) of the NIH. As of November 2015, PubChem contained more than 150 million depositor-provided substance descriptions, 60 million unique chemical structures, and 225 million biological activity test results (from over 1 million assay experiments performed on more than 2 million small molecules, covering almost 10,000 unique protein target sequences that correspond to more than 5,000 genes). It also contained RNA interference (RNAi) screening assays that target over 15,000 genes.

As of August 2018, PubChem contained 247.3 million substance descriptions and 96.5 million unique chemical structures, contributed by 629 data sources from 40 countries. It also contained 237 million bioactivity test results from 1.25 million biological assays, covering more than 10,000 target protein sequences.

As of 2020, with data integrated from over 100 new sources, PubChem contained more than 293 million depositor-provided substance descriptions, 111 million unique chemical structures, and 271 million bioactivity data points from 1.2 million biological assay experiments.

Databases

PubChem consists of three dynamically growing primary databases. As of 5 November 2020 (number of BioAssays is unchanged):

  • Compounds: 111 million entries (up from 94 million in 2017). Contains pure and characterized chemical compounds.
  • Substances: 293 million entries (up from 236 million in 2017 and 163 million in September 2014). Also contains mixtures, extracts, complexes, and uncharacterized substances.
  • BioAssay: bioactivity results from 1.25 million high-throughput screening programs (up from 6,000 in September 2014), with several million values.

Searching

Searching the databases is possible for a broad range of properties including chemical structure, name fragments, chemical formula, molecular weight, XLogP, and hydrogen bond donor and acceptor count.

PubChem contains its own online molecule editor with SMILES/SMARTS and InChI support that allows the import and export of all common chemical file formats to search for structures and fragments.

Each hit provides information about synonyms, chemical properties, chemical structure including SMILES and InChI strings, bioactivity, and links to structurally related compounds and other NCBI databases like PubMed.

In the text search form, database fields can be searched by appending the field name in square brackets to the search term. A numeric range is written as two numbers separated by a colon. Search terms and field names are case-insensitive. Parentheses and the logical operators AND, OR, and NOT can be used; AND is assumed if no operator is given.

Example (Lipinski's Rule of Five):

0:500 0:5 0:10 -5:5
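
The syntax described above (a numeric range, then a field name in square brackets, with terms combined by a logical operator) can be sketched in Python as follows. This is only an illustrative sketch: the class and function names are hypothetical, and the field tags mw, hbdc, hbac, and xlogp are placeholders assumed for this example, since the example above shows the ranges without field names. The inclusive handling of range boundaries is likewise an assumption. The sketch builds the query string locally and does not contact PubChem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeTerm:
    """One numeric range limited to a single database field."""
    low: float
    high: float
    field: str  # ASSUMED field tag, chosen for illustration only

    def render(self) -> str:
        # Two numbers separated by a colon, then the field name in brackets.
        return f"{self.low:g}:{self.high:g}[{self.field}]"

    def matches(self, value: float) -> bool:
        # Treats both ends of the range as inclusive (an assumption).
        return self.low <= value <= self.high


def build_query(terms, operator="AND"):
    """Join rendered terms with an explicit logical operator."""
    op = operator.upper()  # operators are case-insensitive in the search form
    if op not in ("AND", "OR"):
        raise ValueError(f"unsupported operator: {operator!r}")
    return f" {op} ".join(term.render() for term in terms)


def within_ranges(record, terms):
    """Check a local property record against every range (AND semantics)."""
    return all(term.matches(record[term.field]) for term in terms)


# Lipinski-style ranges from the example above; the tags are placeholders.
LIPINSKI = [
    RangeTerm(0, 500, "mw"),
    RangeTerm(0, 5, "hbdc"),
    RangeTerm(0, 10, "hbac"),
    RangeTerm(-5, 5, "xlogp"),
]

print(build_query(LIPINSKI))
```

Because AND is assumed when no operator is given, joining the same terms with spaces alone would express the same search; the explicit operator is written out here only to make the logic visible.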

Database fields


Identification numbers
  • Identification number in current database
  • Substance identification number
  • Compound identification number
  • BioAssay identification number

General
  • Any database field
  • Comment
  • Deposition date
  • Depositor's external ID
  • Source name
  • Source release date
  • Medical Subject Heading (MeSH) term
  • MeSH tree node
  • MeSH pharmacological actions

Substance properties
  • Substance synonyms
  • IUPAC name
  • International Chemical Identifier (InChI)
  • Molecular weight
  • Chemical elements
  • Non-hydrogen atoms
  • Isotope count
  • Total formal charge
  • Chiral atom count
  • Defined chiral atom count
  • Undefined chiral atom count
  • Hydrogen bond acceptor count
  • Hydrogen bond donor count
  • Tautomer count
  • Rotatable bond count
  • XLogP

Compound properties
  • Compound synonyms
  • Component count
  • Covalent unit (molecule) count
  • Total bioactivity count

See also

References

  1. Kim, Sunghwan; Thiessen, Paul A.; Cheng, Tiejun; Zhang, Jian; Gindulyte, Asta; Bolton, Evan E. (9 August 2019). "PUG-View: programmatic access to chemical annotations integrated in PubChem". Journal of Cheminformatics. 11 (1): 56. doi:10.1186/s13321-019-0375-2. PMC 6688265. PMID 31399858.
  2. "PubChem Source Information". The PubChem Project. USA: National Center for Biotechnology Information.
  3. Kim, Sunghwan; Thiessen, Paul A.; Cheng, Tiejun; Yu, Bo; Shoemaker, Benjamin A.; Wang, Jiyao; Bolton, Evan E.; Wang, Yanli; Bryant, Stephen H. (2016). "Literature information in PubChem: associations between PubChem records and scientific articles". Journal of Cheminformatics. 8: Article 32. doi:10.1186/s13321-016-0142-6. PMC 4901473. PMID 27293485.
  4. "Search Results for all compounds". Retrieved 28 January 2016.
  5. Kim, Sunghwan; Chen, Jie; Cheng, Tiejun; Gindulyte, Asta; He, Jia; He, Siqian; Li, Qingliang; Shoemaker, Benjamin A; Thiessen, Paul A; Yu, Bo; Zaslavsky, Leonid; Zhang, Jian; Bolton, Evan E (8 January 2021). "PubChem in 2021: new data content and improved web interfaces". Nucleic Acids Research. 49 (D1): D1388–D1395. doi:10.1093/nar/gkaa971. PMC 7778930. PMID 33151290.
  6. "all[filt] - PubChem Compound Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 7 January 2011.
  7. "all[filt] - PubChem Substance Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 28 January 2016.
  8. "all[filt] - PubChem Substance Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 7 January 2011.
  9. "all[filt] - PubChem BioAssay Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 28 January 2016.
  10. "all[filt] - PubChem BioAssay Results". The PubChem Project. USA: National Center for Biotechnology Information. Retrieved 7 January 2011.
  11. Cheng T (Nov 2007). "Computation of octanol-water partition coefficients by guiding an additive model with knowledge". Journal of Chemical Information and Modeling. 47 (6): 2140–2148. doi:10.1021/ci700257y. PMID 17985865.

External links

Scholia has a profile for PubChem (Q278487).