This is an archive of the "beta" pre-release of the OpenAlex bibliographic metadata corpus. It was downloaded from an AWS S3 "requester pays" bucket, and the individual files were then compressed with gzip (using the pigz command), which significantly reduced the on-disk size. Downloads of some files had to be restarted; this appears to have worked, but it could have introduced corruption. This initial snapshot is dated "2021-10-11" in file names, and that date is used...
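Because restarted downloads may have left damaged files, a reader might want to check each compressed file before use. The following is a minimal sketch, not part of the archive itself: it assumes the files are ordinary gzip streams, and the function name `gzip_is_intact` is hypothetical. It reads each stream to the end so that Python's gzip module validates the stored checksum; it does not detect corruption that happened before compression.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream at `path` decompresses cleanly."""
    try:
        with gzip.open(path, "rb") as stream:
            # Reading to the end makes the gzip module check the CRC32 and
            # length stored in the trailer of each gzip member.
            while stream.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # gzip.BadGzipFile is a subclass of OSError; EOFError signals a
        # truncated stream; zlib.error signals damaged compressed data.
        return False
    return True
```

A caller would pass the path of one downloaded file at a time and re-fetch any file for which the function returns False.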
The dataset was first introduced in the following paper: Siqi Wu, Marian-Andrei Rizoiu, and Lexing Xie. Variation across Scales: Measurement Fidelity under Twitter Data Sampling. In AAAI International Conference on Weblogs and Social Media (ICWSM), 2020. Complete/sampled retweet cascade datasets: these datasets contain 2 pairs of complete/sampled retweet cascades on the topics Cyberbullying (sampling rate: 0.5272) and YouTube (sampling rate: 0.9153). Each line is a cascade for a root tweet, in the...
Source: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/GW9GDM&version=1.1
sub-49
Source: https://figshare.com/articles/dataset/sub-49_rar/16566870/1
This data set contains a set of benchmark instances for Petri net reachability and coverability, which were used to evaluate the TACAS'21 submission "Directed Reachability for Infinite-State Systems" (preprint available online (https://arxiv.org/abs/2010.07912)). This data set is published in order to allow authors of other tools for reachability in Petri nets to easily obtain a large set of benchmark instances, in particular including many reachability instances. See the enclosed...
Source: https://figshare.com/articles/dataset/Benchmark_Instances_for_Reachability_and_Coverability_in_Petri_Nets/14152007/1
The initial structure of Chignolin was generated starting from the cln025 peptide, with sequence TYR-TYR-ASP-PRO-GLU-THR-GLY-THR-TRP-TYR. The structure was solvated in a cubic box of 40 Å, containing 1881 water molecules and two Na+ ions to neutralize the peptide's negative charge. MD simulations were performed with ACEMD, using the CHARMM22* force field and the TIP3P water model at a temperature of 350 K. A Langevin integrator was used with a damping constant of 0.1 ps⁻¹. Integration time step was set to 4...
Source: https://figshare.com/articles/dataset/Chignolin_Simulations/13858898/1
scATAC-seq dataset processed for inclusion in scATAC.Explorer R package. scATAC.Explorer is an R package containing a curated collection of publicly available scATAC-seq datasets that can easily be searched and retrieved within R. Included datasets are processed into a consistent format for ease of analysis. This dataset was not generated by our lab. Please also give credit to the source the dataset was retrieved from, included in reference field.
Source: https://figshare.com/articles/dataset/FreshMouseBrainCellRanger1_2_0/14357105/1
A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Multidimensional Patterns in Oral Microbiome Data
Source: https://figshare.com/articles/dataset/Multidimensional_Patterns_in_Oral_Microbiome_Data/15075162/2
Our main objective was to investigate potential gains and losses in northern hemisphere marine forest habitat along temperate and Arctic coastlines under different climate change scenarios. In particular, we sought to investigate anticipated responses according to three ecosystem distribution categories: marine forests restricted to the Arctic (i.e. oceanic benthic waters with a mean average annual temperature of ca. blue->green->orange->red refers to 1.0->0 probability of...
Source: https://figshare.com/articles/dataset/ARCTIC_MARINE_FOREST_DISTRIBUTION_MODELS_SHOWCASE_SEVERE_NET_LOSSES_AND_TEMPERATE_SUCCESSION_UNDER_CLIMATE_CHANGE/14751753/4
sub-19
Source: https://figshare.com/articles/dataset/sub-19_rar/16564116/1
Contains data and code underlying results presented in Gutknecht, Aaron J.; Wibral, M. (2021): "Significant Subgraph Mining for Neural Network Inference with Multiple Comparisons Correction". CC0 Waiver
Source: https://data.goettingen-research-online.de/dataset.xhtml?persistentId=doi:10.25625/DIVFKP&version=1.0
Academic Data and Datasets
by Jiong Yang; Wenqing Zhang; Yuxiang Wang; Xin Li; Mingjia Yao; Ye Sheng; Haiyang Huo; Lili Xi
The output files of MIP3D.
Source: https://springernature.figshare.com/articles/dataset/Output_files_part18_/14464971/1
Daily files of human thermal-stress indices for February 1994 over South and East Asia
Source: https://springernature.figshare.com/articles/dataset/HiTiSEA_1994-02/14562990/1
.
Source: https://zenodo.org/record/4773888
This item contains an annual copy of the ORCID public data file, as originally downloaded from: https://orcid.figshare.com/articles/dataset/ORCID_Public_Data_File_2021/16750535 See also: https://info.orcid.org/orcids-2021-public-data-file-is-now-available More details about this content and its use are available at: https://orcid.org/content/orcid-public-data-file This dataset is available under the public domain (CC-0).
by Internet Archive Web Group
This is an updated snapshot of the Microsoft Academic Graph corpus. Microsoft generously makes this corpus available at no cost under the ODC-BY "open data license" ( https://opendatacommons.org/licenses/by/1.0/ ). See the link for details; at a minimum this license requires downstream users to acknowledge the creator. You can read more about the corpus, including how to obtain updated copies on Microsoft Azure, a schema reference, etc, at the following URLs and in the following...
Downloaded from http://japanlinkcenter.org/top/material/material_metadata.html
A new global gross primary production dataset covering 1980–2018 (under review on Scientific Data)
Source: https://figshare.com/articles/dataset/BTCH_zip/14332580/3
Daily files of human thermal-stress indices for March 1988 over South and East Asia
Source: https://springernature.figshare.com/articles/dataset/HiTiSEA_1988-03/14560443/1
sub-47
Source: https://figshare.com/articles/dataset/sub-47_rar/16566861/1
This is an updated snapshot of the Microsoft Academic Graph corpus. Microsoft generously makes this corpus available at no cost under the ODC-BY "open data license" ( https://opendatacommons.org/licenses/by/1.0/ ). See the link for details; at a minimum this license requires downstream users to acknowledge the creator. You can read more about the corpus, including how to obtain updated copies on Microsoft Azure, a schema reference, etc, at the following URLs and in the following...
See note.txt for details.
Source: https://figshare.com/articles/dataset/IJGIS_DATA/13078061/1
- main_code.zip contains the MATLAB code used in this study, also found in https://github.com/yagmurerten/Canstraint
- canstraint_analytics.nb and canstraint_analytics_SI.nb are the Mathematica notebooks used for the analyses in the main text and supplementary information, respectively.
- popcheck.m is an additional script that was used to extract data from final .mat files (generates analysis_scripts/supp_data/supp_mature_popsize.txt).
- derived_data.zip contains the .rds files derived from...
Source: https://figshare.com/articles/dataset/ErtenKokko2021_canstraint/13537295/1
Wiki-Reliability: Machine Learning datasets for measuring content reliability on Wikipedia Consists of metadata features and content text datasets, with the format {template_name}_features.csv and {template_name}_txt.csv.gz respectively. For more details on the project and links to data reading and benchmarking: https://meta.wikimedia.org/wiki/Research:Wiki-Reliability:_A_Large_Scale_Dataset_for_Content_Reliability_on_Wikipedia
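The naming format above can be sketched in code. This is an illustration only, not the project's own loader: the helper names `dataset_paths` and `read_rows` are hypothetical, and it assumes each CSV file starts with a header row, which the description does not state.

```python
import csv
import gzip


def dataset_paths(template_name):
    """Build the two file names for one template, following the stated format."""
    return (f"{template_name}_features.csv", f"{template_name}_txt.csv.gz")


def read_rows(path):
    """Yield rows as dicts from a plain or gzip-compressed CSV file.

    Assumes the first row is a header (an assumption, not confirmed by the source).
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle)
```

For a given template name, `dataset_paths` returns the metadata-features file name and the content-text file name, and `read_rows` can iterate over either one.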
Source: https://figshare.com/articles/dataset/Wiki-Reliability_A_Large_Scale_Dataset_for_Content_Reliability_on_Wikipedia/14113799/1
Zone 28 biophysical gradients
Source: https://springernature.figshare.com/articles/dataset/z28_LANDFIRE_Biophysical_Gradient_Raster_Datasets/13053530/1
Model weights / tensorflow checkpoints for Max F. Burg et al. (2021): Learning Divisive Normalization in Primary Visual Cortex CC0 Waiver
Source: https://data.goettingen-research-online.de/dataset.xhtml?persistentId=doi:10.25625/0JCXYO&version=2.0
This data set contains the stellar velocities used in the paper. It also includes profiles and parameter distributions derived using CJAM. It does not include the raw MUSE data which is available in the ESO archive. CC0 Waiver
Source: https://data.goettingen-research-online.de/dataset.xhtml?persistentId=doi:10.25625/VCNHOR&version=1.0
ENDOR measurement raw data (263 GHz), summarized DFT results, simulation results and processing notebooks for re-creating all figures in the manuscript and supplementary information of the paper: "Distribution of H$^\beta$ Hyperfine Couplings in a Tyrosyl Radical Revealed by 263 GHz ENDOR Spectroscopy". This dataset is published under the CC BY-NC-ND 4.0 license (Attribution-NonCommercial-NoDerivatives 4.0 International).
Source: https://data.goettingen-research-online.de/dataset.xhtml?persistentId=doi:10.25625/AAFR6T&version=1.0
Replication data for “The Hidden Costs of Requiring Accounts: Quasi-Experimental Evidence from Peer Production” by Benjamin Mako Hill and Aaron Shaw, to be published in Communication Research. Replicating the analysis presented in the paper: Replicating this analysis involves a number of steps. We have attempted to include the most “raw” versions of the data to allow replication of our full data pipeline. This includes three sources of data: MediaWiki XML dump files: We have included...
Source: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/CLSFKX&version=1.1
URL lists to PDFs on the web (and preserved in the wayback machine) which are likely to contain research materials.
Mirrored via torrent from academic torrents: https://academictorrents.com/details/0c6c3fbfdc13f0169b561d29354ea8b188eb9d63
Indentation simulations, where microspheres are pressed into layers of entangled actin networks. Final states of entangled networks of E38 served as initial states. CC0 Waiver
Source: https://data.goettingen-research-online.de/dataset.xhtml?persistentId=doi:10.25625/SP3BTR&version=1.0
publication of the ANLAN RISTOJA Project CC0 Waiver
Source: https://data.lipi.go.id/dataset.xhtml?persistentId=hdl:20.500.12690/RIN/IDDOAH&version=1.0
The MOOD project (MOnitoring Outbreak events for Disease surveillance in a data science context. H2020) has geo-referenced the data Google has published as a series of PDF files presenting reports on national and subnational human mobility levels relative to a baseline data of late January 2020. The details and the PDF files can be found at https://www.google.com/covid19/mobility/ . More detail on these files can be found at https://www.moodspatialdata.com/humanmobilityforcovid19 The first set...
Source: https://figshare.com/articles/dataset/Maps_of_human_mobility_change_during_the_COVID-19_outbreak/12130980/70
Hyperspectral radiance data. The zipped file contains an unaveraged hyperspectral radiance image in Matlab MAT format, size 1024 × 1344 × 33, and spectral radiance in W m⁻² sr⁻¹ nm⁻¹. The image is one of a pair of images acquired at different times. The zipped file also contains a BMP colour image, rendered from the radiance image, and an information document, which includes geographic location. The separate JPG image of the scene is for illustration only. Padding artefacts may be...
Source: https://figshare.com/articles/dataset/Ruiv_es_Cottage_16_52_-_Hyperspectral_Radiance_Image/13436318/1
Goose and duck CT data
Source: https://figshare.com/articles/dataset/CT_data/15043422/1
This is the updated 2021 ReproChecklist
Source: https://figshare.com/articles/dataset/Updated_2021_ReproChecklist/15040293/1
The human gut microbiome produces a complex mixture of biomolecules which interact with human physiology and play essential roles in health and disease. Crosstalk between micro-organisms and host cells is favored by different direct contacts, but also by the export of molecules through secretion systems and extracellular vesicles. The resulting molecular network, comprised of various biomolecular moieties, has so far eluded systematic study. Here we present a methodological framework, optimized...
Source: https://figshare.com/articles/dataset/TableS10/16940353/1
Data repository for the data underlying the Online Labour Index. See http://ilabour.oii.ox.ac.uk/online-labour-index/ for details.
Source: https://figshare.com/articles/dataset/Online_Labour_Index_Measuring_the_Online_Gig_Economy_for_Policy_and_Research/3761562/1977
by Nebojsa Malesevic; Alexander Olsson; Paulina Sager; Elin Andersson; Christian Cipriani; Marco Controzzi; Anders Björkman; Christian Antfolk
Subject 13 signals
Source: https://springernature.figshare.com/articles/dataset/Subject_13/12807998/1
by David E. Shaw; John Klepeis; Alexander G. Donchev; Andrew G. Taube; Elizabeth Decolvenaere; Cory Hargus; Robert T. McGibbon; Ka-Hei Law; Brent A. Gregersen; Je-Luen Li; Kim Palmo; Karthik Siva; Michael Bergdorf
Full dataset, containing interaction energies calculated using CCSD(T), MP2, HF, and SAPT0, as well as dimer geometries.
Source: https://springernature.figshare.com/articles/dataset/Donchev_et_al_DES370K/12692714/1
A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
This is a corpus of millions of citations from Wikipedia articles, for a subset of language wikis, created using the wikiciteparser Python library.
by Yolla German; Loan Vulliard; Anton Kamnev; Laurène Pfajfer; Jakob Huemer; Anna-Katharina Mautner; Aude Rubio; Artem Kalinichenko; Kaan Boztug; Audrey Ferrand; Jörg Menche; Loïc Dupré
This dataset contains the CellProfiler pipelines and the corresponding morphological measurements used in the study on morphological profiling of immunological synapses by German, Vulliard et al.
Source: https://figshare.com/articles/dataset/GermanVulliard2020_zip/11619960/2
This data is linked to the article of the same title published in Water Research. To run the SPA, the user is referred to the pyspa package (https://github.com/hybridlca/pyspa), which reads the csv files in this dataset and produces results such as those in the xlsx file in this dataset. Through the pyspa package the user can set different thresholds and/or select different countries to focus on.
Source: https://figshare.com/articles/dataset/Water_energy_and_carbon_dioxide_footprints_of_the_construction_sector_a_case_study_on_developed_and_developing_economies/12661580/1
Hyperspectral radiance data. The zipped file contains an unaveraged hyperspectral radiance image in Matlab MAT format, size 1024 × 1344 × 33, and spectral radiance in W m⁻² sr⁻¹ nm⁻¹. The image is one of a pair of images acquired at different times. The zipped file also contains a BMP colour image, rendered from the radiance image, and an information document, which includes geographic location. The separate JPG image of the scene is for illustration only. Padding artefacts may be...
Source: https://figshare.com/articles/dataset/Tib_es_Corridor_16_05_-_Hyperspectral_Radiance_Image/13434338/1
Assembled Illumina short reads from human oral cavity samples processed using the state-of-the-art DNA extraction strategies for shotgun metagenomics described in the following study: https://doi.org/10.1101/2021.03.03.433801 These two metagenomes are from the same individual's oral samples and generated from (1) the sample used for long-read sequencing of the same material to test HMW DNA extraction method 01 in our study (here named ORAL_ILLUMINA_METHOD_01_REPL_02_ASSEMBLED) and (2) the...
Source: https://figshare.com/articles/dataset/Assembled_Illumina_Short_Reads/14141819/1
BigWig and NarrowPeak/BroadPeak files from H3K27ac ChIP-seq and ATAC-seq datasets from the craniofacial regions of the human and mouse embryos.
Source: https://figshare.com/articles/dataset/Mammalian_Craniofacial_Epigenetics/15085245/5
ePoster for the conference abstract "The bumpy road of FAIRification in practice" presented at the GMDS 2021 conference (remote, Kiel) CC0 Waiver
Source: https://data.goettingen-research-online.de/dataset.xhtml?persistentId=doi:10.25625/8VUKQA&version=2.0
Data of Experiment E50. This experiment is a set of isolated filament simulations and is part of the PhD thesis of Ilyas Kuhlemann. The data of this experiment cover intermediate frame rates. Together with data from its twin experiments E51 and E52, MSD data with large frequency ranges were plotted. CC0 Waiver
Source: https://data.goettingen-research-online.de/dataset.xhtml?persistentId=doi:10.25625/AOGTAS&version=1.0
This dataset contains intermediate data products for some experiments in the PhD thesis "Learned infinite elements for helioseismology" (J.Preuß) which compare different transparent boundary conditions for the Atmo model of the solar atmosphere. A brief description of the provided data is given below: The file "power-spectrum-Atmo-whitw-comp.out" contains the reference power spectrum obtained from the exact transparent boundary condition for the Atmo model based on...
Source: https://data.goettingen-research-online.de/dataset.xhtml?persistentId=doi:10.25625/WX473B&version=1.0
The column name represents the TCGA ID, and the row name represents the gene symbol.
Source: https://figshare.com/articles/dataset/dat2_zip/13515998/1
Cytoscape session accompanying: Amici, D.R. et al. A network of core and subtype-specific gene expression programs in myositis. Acta Neuropathol (2021). https://doi.org/10.1007/s00401-021-02365-5
Source: https://figshare.com/articles/dataset/myositis_cys/16602497/1
Attention data set for the FLUX pipeline
Source: https://figshare.com/articles/dataset/FLUX_attention_data/16621738/1
An example of sub-critical fracture development triggered by an acoustic wave; an example of a DEM simulation of the fracturing of a soft and flexible material under tensional load; a DEM simulation of the fracturing of a strong and flexible material under tensional load; a DEM simulation showing how a single “super-fiber” resistance to breaking apart a sample is built; the view, perpendicular to the tensional load, of the temporal evolution (breaking) of inter-particle bonds (black dots) in...
Source: https://rs.figshare.com/articles/dataset/Trigered_sub-critical_fracturing_Fracturing_of_sof_and_flexible_material_Fracturing_of_strong_and_flexible_material_Super-fiber_resistance_to_breaking_DEM_and_Fiber_Bundle_model_similarity_from_Earthquake_physics_beyond_the_linear_fracture_/13627820/1
Download CBDB Standalone Database. The standalone version of the China Biographical Database (CBDB) contains data on over 420,000 men and women in MS ACCESS Format. Documentation is included. Project Website (2019-04-24) © China Biographical Database. Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.
Source: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/2UFYFG&version=1.1
Daily files of human thermal-stress indices for April 1983 over South and East Asia
Source: https://springernature.figshare.com/articles/dataset/HiTiSEA_1983-04/14559546/1
trwiki-67 dataset: trwiki-67 is a language modeling dataset that contains 67 million words of raw Wikipedia articles. It can be used as a benchmark for different language modeling tasks at the character, subword, or word level. This dataset was extracted from a Turkish Wikipedia dump on 20 July 2021. Preprocessing: All lists and tables were removed from the articles, and the initial extraction from the .xml dump was done using wikiextractor. Additionally, further preprocessing was applied to get rid...
Source: https://zenodo.org/record/5146001
URL lists to PDFs on the web (and preserved in the wayback machine) which are likely to contain research materials.
See: https://guide.fatcat.wiki/reference_graph.html License: CC-0
Downloaded from http://japanlinkcenter.org/top/material/material_metadata.html
by Eric Capo; Alexandra Rouillard; Charline Giguet-Covex; Kevin Nota; Aurèle Vuillemin; Peter Heintzman; Marco J. L. Coolen; Laura Saskia Epp; Isabelle Domaizon; Inger Greve Alsos; Laura Parducci
This folder compiles the raw data used in each of the case studies included in this manuscript.
Source: https://figshare.com/articles/dataset/Data_from_manuscript_Lake_sedimentary_DNA_research_on_past_terrestrial_and_aquatic_biodiversity_Overview_and_recommendations_/13007279/2
by Manuel de Pedro; Miquel Riba; Santiago González-Martínez; Pedro Seoane; Rocío Bautista; M. Gonzalo Claros; Maria Mayol
This VCF file includes the filtered SNP dataset (168,733 SNPs) for 238 Leontodon longirostris samples and 20 samples of the outgroup sister species Leontodon saxatilis.
Source: https://figshare.com/articles/dataset/Leontodon_vcf/12903848/2
Zn-atz-oba gas sorption data
Source: https://figshare.com/articles/dataset/Zn-atz-oba_gas_sorption_data_Exp_Sim_xlsx/16571151/2
bigwigs
Source: https://figshare.com/articles/dataset/2-1_bw/13526330/1
Agrypnia genome assembly submitted to NCBI. Contamination was filtered out using BlobTools.
Source: https://figshare.com/articles/dataset/agrypnia_filtered_fasta/13383092/1
Mass cytometry data of BM and HSPC cells.
Source: https://figshare.com/articles/dataset/Original_fcs_files_of_mass_cytometry/16528836/2
sub-18
Source: https://figshare.com/articles/dataset/sub-18_rar/16564113/1
Daily files of human thermal-stress indices for May 2001 over South and East Asia
Source: https://springernature.figshare.com/articles/dataset/HiTiSEA_2001-05/14569710/1
- Consumer demand for 3 dishes: Spaghetti Bolognese, meatballs with rice and peas, buns with sausage
- Individually adapted choice-based conjoint
- Structure of data
  + 1/3 raw data (wide data)
  + 2/3 long data (formatted to long)
  + 3/3 analysed_data (labelled and modelled data)
  + 3/3 labelled data (=analysed data with label of each Variable in 2nd row)
- We also provide the original survey: survey.pdf
- and the coding in Stata 17:
  + 1/3 how raw data was created
  + 2/3 how data was transformed from wide to...
Source: https://data.goettingen-research-online.de/dataset.xhtml?persistentId=doi:10.25625/MZQGOO&version=1.0