Sept 22, 2017
Please run the following snippet to enable shared configuration:
    RS=/mnt/research/quantgen
    echo -e "\nsource $RS/tools/configfiles/bash/bashrc" \
        >> ~/.bashrc
    echo -e "\nsource $RS/tools/configfiles/bash/bash_profile" \
        >> ~/.bash_profile
    touch /mnt/research/quantgen/tools/configfiles/bash/subscribers/$USER
Benefits: auto-loads R and PLINK, better defaults for working together, easier updates.
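Running the snippet twice appends the source lines twice. A minimal sketch of a variant that is safe to re-run is shown here; the function names append_once and enable_shared_config are hypothetical and not part of the shared configuration, and the subscribers touch step is left as in the original snippet.

```shell
#!/usr/bin/env bash

# Append LINE to FILE (preceded by a blank line) unless FILE already
# contains exactly that line.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '\n%s\n' "$line" >> "$file"
}

# Hypothetical wrapper: the first argument is the research space root,
# the second is the home directory to modify (defaults to $HOME).
enable_shared_config() {
    local rs="${1:-/mnt/research/quantgen}" home="${2:-$HOME}"
    append_once "source $rs/tools/configfiles/bash/bashrc" "$home/.bashrc"
    append_once "source $rs/tools/configfiles/bash/bash_profile" "$home/.bash_profile"
}
```

Calling enable_shared_config with no arguments uses the paths from the snippet above; the touch command still needs to be run once by hand.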
datasets, projects, scratch, tools, logs, shares, etc.
See: /mnt/research/quantgen/README
datasets directory
Subdirectories of a dataset:
source (read-only, sometimes encrypted and with access control)
derivative (read-only)
playyard (read-and-write)
projects directory
(UKB/landscape)
(gruenebe)
scratch directory (/mnt/ls15/scratch/groups/quantgen)
For I/O-heavy projects:
    $ crontab -l
    0 0 * * * /mnt/research/quantgen/tools/cronjobs/ukb-500-output-transfer.sh
    $ cat /mnt/research/quantgen/tools/cronjobs/ukb-500-output-transfer.sh
    rsync -av /mnt/research/quantgen/scratch/projects/UKB/PIPELINE500/GWAS \
        /mnt/research/quantgen/projects/UKB/PIPELINE500/output/
Let me know if you need help setting this up.
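One way to install such a nightly entry without opening an editor is sketched here. The helper add_cron_entry is a hypothetical name; it reads an existing crontab on standard input and prints it back with the entry appended only if that exact line is missing.

```shell
#!/usr/bin/env bash

# Read a crontab from stdin and print it with ENTRY appended,
# unless an identical line is already present.
add_cron_entry() {
    local entry="$1" current
    current="$(cat)"
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    if ! printf '%s\n' "$current" | grep -qxF -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}
```

Assuming the transfer script already exists at the path shown above, it could be used as:

    crontab -l 2>/dev/null \
        | add_cron_entry "0 0 * * * /mnt/research/quantgen/tools/cronjobs/ukb-500-output-transfer.sh" \
        | crontab -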
500k dataset was released.
All: 488,377
White British: 409,703
Calls: 805,426
Genotype Calls: /mnt/research/quantgen/datasets/UKB/source/genotypes/calls500
Phenotypes: /mnt/research/quantgen/datasets/UKB/source/phenotypes (no changes)
Genotype-derived phenotypes: /mnt/research/quantgen/datasets/UKB/source/genotypes/sample_qc
Problem:
> The genetic data was imputed using two different reference panels. The
> Haplotype Reference Consortium (HRC) panel was used as first choice
> option, but for SNPs not in that reference panel the UK10K + 1000 Genomes
> panel was used. The problem arose in the second set of imputed data from
> the UK10K + 1000 Genomes panel. The genotypes at these SNPs are imputed
> correctly, but have not been recorded as having the correct genome
> position in the files.
>
> For now we recommend that researchers focus exclusively on SNPs in the
> HRC panel, or work with the directly genotyped data until the new release
> is available.
http://www.ukbiobank.ac.uk/2017/07/important-note-about-imputed-genetics-data/
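If a list of HRC panel marker IDs is available, restricting a marker table to those IDs could look roughly like the sketch here. Everything in it is an assumption for illustration: the function name filter_markers, a keep list with one ID per line, and a whitespace-delimited marker table with the ID in column 2 (the PLINK .bim layout).

```shell
#!/usr/bin/env bash

# filter_markers KEEP_IDS MARKERS
# KEEP_IDS: hypothetical file with one marker ID per line.
# MARKERS:  marker table with the marker ID in column 2.
# Prints only the rows of MARKERS whose ID appears in KEEP_IDS.
filter_markers() {
    awk 'FILENAME == ARGV[1] { keep[$1] = 1; next } ($2 in keep)' "$1" "$2"
}
```

The comparison against ARGV[1] rather than the common NR == FNR idiom keeps the filter correct when the keep list is empty.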
derivative directory: /mnt/research/quantgen/datasets/UKB/derivative/
BED/calls500_unfiltered (renamed original BED files)
cohorts/calls500_unfiltered/whites (white cohort, FID IID)
relabeled_phenotypes (uses labels instead of cryptic field IDs)
Project directory: /mnt/research/quantgen/projects/UKB/PIPELINE500
BED (whites only, minor QC)
phenotypes and phenotypes_genetic (whites only)
adjusted_phenotypes (height)
cohorts (genotyped_white, genotyped_white_related, genotyped_white_unrelated)
BGData, summaries
GMatrix and related_pairs
sample_sets, GWAS
ld, markers
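Cohort files with FID and IID columns are in the format that PLINK's --keep option reads, so subsetting a BED file set to one of the cohorts could be sketched as follows. The function name subset_cohort and the example file prefixes are hypothetical; the actual file names inside the BED and cohorts directories are not given here.

```shell
#!/usr/bin/env bash

# subset_cohort BFILE COHORT OUT
# BFILE:  prefix of a PLINK .bed/.bim/.fam file set
# COHORT: text file with FID and IID columns
# OUT:    prefix for the new file set
subset_cohort() {
    local bfile="$1" cohort="$2" out="$3"
    plink --bfile "$bfile" --keep "$cohort" --make-bed --out "$out"
}
```

For example, with placeholder names:

    subset_cohort BED/chr22 cohorts/genotyped_white_unrelated chr22_unrelated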