17 Sep 15:31

choishingwan

fadb155

MAF calculation

Update Log

Fix Segmentation fault related to MAF calculation


16 Sep 19:16

choishingwan

2.2.8

42f9d60

PRSet Clumping Fix

Fixed bug in PRSet clumping which caused different results to be generated when different sets were used
Fixed multi-threading memory detection problem. Multi-threading for set-based permutation should now work as expected


11 Sep 15:44

choishingwan

2.2.7

b6fdcff

Set based permutation speed up, memory mapping and more

Update Log

PRSice can now recognize gz files without the gz suffix
Added better checks for parameters related to distance
Cleaned up some code to make the code base more readable
Fix memory leak problems related to #128, #131, #137
Completely removed Pearson correlation clumping. We don't have the manpower to maintain the code base, and Pearson correlation clumping does not provide enough benefit for us to consider supporting it.
Add memory map feature (--enable-mmap). When a large amount of memory is available and all genotypes are stored in the same file, --enable-mmap might help to speed things up a bit
Updated code for linear regression, which now adopts code from RcppEigen
Update internal variable types for all score printing. We can now generate an all score file for more samples and thresholds. We can allow 5.270498e+17 samples if there's one threshold, or 1.844674e+18 thresholds if there are 500k samples.
Update internal variable types for p-value thresholding. Previously, if a user required an ultra small step size (e.g. < 1e-20), PRSice would generate abnormal thresholds (see here). To accommodate this use case, PRSice will now detect whether the number of thresholds required exceeds what we can store and, if so, use a slower alternative to generate the thresholds.
Set-based permutation was too slow to be practical. We perform some algebra tricks to speed the process up. For more detail, you can refer to our full manual.
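
The p-value thresholding change can be illustrated with a minimal Python sketch. It is not PRSice's implementation: the function names and the 32-bit index limit are hypothetical, chosen only to show the idea of counting the thresholds first and switching to a lazier strategy when the count cannot be stored.

```python
import math
from typing import Iterator

# Hypothetical limit: the largest count an assumed unsigned 32-bit index can hold.
MAX_INDEXED = 2**32 - 1


def count_thresholds(lower: float, upper: float, step: float) -> int:
    """Number of thresholds lower, lower + step, ... that do not exceed upper."""
    if step <= 0 or upper < lower:
        raise ValueError("need step > 0 and upper >= lower")
    return math.floor((upper - lower) / step) + 1


def fits_index(lower: float, upper: float, step: float) -> bool:
    """True when every threshold can be addressed by the assumed index type."""
    return count_thresholds(lower, upper, step) <= MAX_INDEXED


def thresholds(lower: float, upper: float, step: float) -> Iterator[float]:
    """Yield thresholds one at a time instead of storing them all (the slow path)."""
    for i in range(count_thresholds(lower, upper, step)):
        yield lower + i * step


print(fits_index(0.0, 1.0, 0.25))   # a coarse grid fits the index
print(fits_index(0.0, 1.0, 1e-20))  # an ultra small step does not
```

With a step of 1e-20 the count is around 1e20, far beyond the assumed index range, so a caller would use the generator rather than a precomputed table.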


05 Aug 19:14

choishingwan

2.2.6

fac8e78

Bug Fix

Update Log

Now output number of SNPs in each threshold for --no-regress analyses
Fix problem related to bed file. Can now read bed file with chromosome format of chrX
Fix problem related to --maf when --ld is used. When --ld is used, the calculation of --maf is done on the target file but using the SNP location within the reference, which causes problems.


30 Jul 19:46

choishingwan

2.2.5

a850a69

GLM improvement

Update Log

Fix glibc error in 2.2.4. This was due to an array out-of-bounds error, which is now fixed
We have now adapted code from the fastglm and speedglm packages. For 100k samples and no covariates, we expect around a 50% increase in speed. This new implementation also allows us to incorporate different link functions and distribution "family" in the future if those functions are required.
The --use-ref-maf function is also available now. --use-ref-maf will cause the missingness imputation to be performed using the MAF of the reference sample instead of the target.
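
To make the effect of --use-ref-maf concrete, here is a minimal Python sketch of MAF-based mean imputation. It assumes a missing genotype is replaced by its expected allele count, 2 × frequency, which is the usual mean-imputation convention; whether PRSice does exactly this is an assumption, and both function names are hypothetical.

```python
from typing import List, Optional, Sequence


def allele_frequency(genotypes: Sequence[Optional[int]]) -> float:
    """Frequency of the counted allele among non-missing genotypes (0, 1 or 2 copies)."""
    observed = [g for g in genotypes if g is not None]
    if not observed:
        raise ValueError("all genotypes are missing")
    return sum(observed) / (2 * len(observed))


def impute_missing(genotypes: Sequence[Optional[int]], frequency: float) -> List[float]:
    """Replace each missing genotype (None) with the expected count 2 * frequency."""
    expected = 2.0 * frequency
    return [expected if g is None else float(g) for g in genotypes]


target = [0, 1, None, 2]   # one missing genotype in the target sample
reference = [0, 0, 0, 1]   # the same SNP in a reference panel

# Default behaviour in this sketch: impute from the target's own frequency.
print(impute_missing(target, allele_frequency(target)))
# Analogue of --use-ref-maf: impute from the reference frequency instead.
print(impute_missing(target, allele_frequency(reference)))
```

The only difference between the two calls is which sample supplies the frequency, which is the choice the flag exposes.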

2019-08-05

Temporarily uploaded the PRSet no memory check version for linux


25 Jul 15:15

choishingwan

2.2.4

6493aca

BGEN bug fix

Update Log

Fix #125, BGEN encoding was wrong for SNPs that are 100% confident to be the non-effective allele.
Fix problem of --remove and --keep with BGEN file when the sample ID is embedded within the BGEN file
Fix problem related to --keep-ambig where base file filtering was unexpectedly affected by the --keep-ambig flag
Re-implemented the check for 100% genotype missingness.
PRSice should now correctly terminate when none of the phenotypes are identified in the phenotype file
SNPs extraction / exclusion are now performed when reading the base file. This should provide a small performance boost

2019-07-29

Trying to fix problem of glibc error w.r.t the linux binary


15 Jul 15:16

choishingwan

2.2.3

a758b0b

Bug Fix

Update Log

Fix problem with --base-maf and --base-info flag with the Rscript
PRSice will now only issue a warning instead of erroring out when the INFO or MAF field isn't found in the base file
Fix the log message output
Minor improvement to the file read during MAF calculation. Should in theory speed up the MAF calculation and avoid bed file read error


27 Jun 19:24

choishingwan

2.2.2

d258f4d

Bug Fix

Update Log

Disabled --pearson as it is currently bugged and I don't have time to fix it
Fix calculation of MAF when --maf is used. (for binary plink format)
Fix LD calculation for BGEN files
BGEN --hard results should now produce identical results as those generated from the plink2 -> bed -> PRSice pipeline
Disabled --msse4 optimization for the Windows binary as some Windows machines cannot run such a binary
For the OS X binary, note that machines more than 4 years old will not be able to run the shipped binary. You will need to compile the software yourself.


04 Jun 21:17

choishingwan

2.2.1

d791b25

Minor Bug fix

Pre-release

Update Log

Fix problem with the Rscript
Fix problem with --x-range when no location information was provided in the summary statistic file
Update default of clumping. --clump-kb is now reset to 250kb for PRSice and is 1mb for PRSet
Clearer information regarding the unit for distance parameters (the default unit of --clump-kb is kb; the default unit of --wind-5 and --wind-3 is bp). In most cases, if you include the unit, PRSice should always be able to recognize it (e.g. B,M,G,BP,MB,GB etc)
Some minor update to the multi-threading w.r.t permutation analysis
When SNP sets are provided, PRSice should now correctly consider PRSet is activated
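
The unit handling for distance parameters described above can be sketched in Python as follows. This is an illustration, not PRSice's parser: parse_distance is a hypothetical helper, the decimal (×1000) multipliers are an assumption, and the K/KB entries are inferred from values such as 250K rather than from the unit list quoted above.

```python
import re

# Multipliers in base pairs. Decimal (x1000) steps are an assumption.
UNIT_TO_BP = {
    "b": 1, "bp": 1,
    "k": 1_000, "kb": 1_000,
    "m": 1_000_000, "mb": 1_000_000,
    "g": 1_000_000_000, "gb": 1_000_000_000,
}

# A non-negative number followed by an optional alphabetic unit suffix.
_DISTANCE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*$")


def parse_distance(text: str, default_unit: str) -> int:
    """Convert a value such as '250', '250kb' or '1M' to base pairs.

    default_unit is applied when no suffix is given, mirroring the idea that
    --clump-kb defaults to kb while --wind-5 and --wind-3 default to bp.
    """
    match = _DISTANCE.match(text)
    if match is None:
        raise ValueError(f"cannot parse distance: {text!r}")
    number, unit = match.groups()
    unit = (unit or default_unit).lower()
    if unit not in UNIT_TO_BP:
        raise ValueError(f"unknown unit: {unit!r}")
    return round(float(number) * UNIT_TO_BP[unit])


print(parse_distance("250", "kb"))  # no suffix, so the kb default applies
print(parse_distance("1M", "kb"))   # an explicit suffix overrides the default
print(parse_distance("200", "bp"))  # no suffix, so the bp default applies
```

Passing the default unit explicitly keeps one parser usable for parameters whose defaults differ.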

NOTE

We have found that --pearson currently generates incorrect results. Please refrain from using it at the moment.
We have now uploaded the correct version for linux

Warning

We found that --pearson is currently wrong
We also found that bgen doesn't produce the expected results


17 May 14:54

choishingwan

2.2.0

16efff9

Major Release: Better Integration of PRSet

Update Log

General

Standardize command line parameters. For any parameters that act on files other than target, they will contain a prefix of the file name. For example, --base-info will perform INFO score filtering on base file, --ld-info will perform INFO score filtering on the LD reference file and --info will perform INFO score filtering on the target file.
Changed --cov-file and --pheno-file to --cov and --pheno because I am lazy
Removed --se and --prslice because we don't use those options. Might add them back when we introduce new function
Add --id-delim to allow more flexible control of sample ID concatenation
--maf and --ld-maf calculations are now restricted to founders, similar to PLINK.
Restructured the code to allow easier diagnosis
Add full unit testing for some of the classes, such as Region and SNP. Don't have time for all other classes.
Slight optimization of the GLM algorithm.
Executable for OS X and Linux are now compiled with Intel MKL library, which should provide some speed boost
Fix some of the usage and log messages
Updated and reorganized our user manual
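
The founder restriction on --maf and --ld-maf mentioned in this list can be pictured with a short Python sketch. It assumes the common PLINK convention that a founder is a sample whose paternal and maternal IDs are both "0"; the type alias and function names are hypothetical.

```python
from typing import List, Optional, Tuple

# (paternal ID, maternal ID, count of the tested allele or None if missing)
Sample = Tuple[str, str, Optional[int]]


def is_founder(paternal_id: str, maternal_id: str) -> bool:
    """Assumed convention: both parental IDs are '0' for a founder."""
    return paternal_id == "0" and maternal_id == "0"


def founder_allele_frequency(samples: List[Sample]) -> float:
    """Allele frequency computed from non-missing founder genotypes only."""
    counts = [g for pat, mat, g in samples if is_founder(pat, mat) and g is not None]
    if not counts:
        raise ValueError("no founders with observed genotypes")
    return sum(counts) / (2 * len(counts))


samples = [
    ("0", "0", 2),      # founder
    ("0", "0", 0),      # founder
    ("P1", "M1", 2),    # non-founder, ignored
    ("0", "0", None),   # founder with a missing genotype, ignored
]
print(founder_allele_frequency(samples))
```

Excluding the non-founder keeps related samples from counting the same parental alleles twice.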

Default Changed

Default for --clump-kb changed to --clump-kb 1M from --clump-kb 250K
Default for --lower changed from --lower 0.0001 to --lower 5e-08

PRSet

Add documentation for --wind-3 and --wind-5, which pad each genomic region at the 3' or 5' end respectively. (These were available since 2.1.9, but we forgot to document them)
Combine --snp-set and --snp-sets into --snp-set. PRSet will now automatically detect whether the input contains one column (in which case the whole file is one gene set) or more than one column (in which case each row is one gene set).
Add documentation for --background. Use --background to specify a background region for competitive p-value calculation
Add the --full-back parameter, which tells PRSet to use the whole genome as the background

Note: if neither --full-back nor --background is provided, and --gtf and --set-perm are specified, we will use the GTF file to construct the background. If --gtf is missing, then we cannot perform competitive p-value calculation
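
The automatic format detection for --snp-set described in the list above can be sketched in Python. This is only an illustration under stated assumptions: in the multi-column case the first column is taken as the set name and the remaining columns as SNP IDs, and parse_snp_sets and single_set_name are hypothetical names.

```python
from typing import Dict, List


def parse_snp_sets(lines: List[str], single_set_name: str) -> Dict[str, List[str]]:
    """Return a mapping from set name to the SNP IDs in that set.

    One column on every row: the whole input is a single set.
    More than one column: each row is its own set (assumed layout:
    set name first, SNP IDs after it).
    """
    rows = [line.split() for line in lines if line.strip()]
    if not rows:
        return {}
    if all(len(row) == 1 for row in rows):
        return {single_set_name: [row[0] for row in rows]}
    return {row[0]: row[1:] for row in rows}


# One column: a single set named by the caller.
print(parse_snp_sets(["rs1", "rs2", "rs3"], "my_set"))
# Several columns: one set per row.
print(parse_snp_sets(["setA rs1 rs2", "setB rs3"], "unused"))
```

Detecting the layout from the column count is what lets a single flag replace the earlier pair.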

BGEN

Change the --hard-thres and --ld-hard-thres parameters. They are now used to specify the hardcall threshold, i.e. a hardcall is saved when the distance to the nearest hardcall is less than the hardcall threshold. Otherwise a missing code is saved. See our documentation for more information
Add --dose-thres and --ld-dose-thres parameters. They are similar to our old --hard-thres: for any SNP, if the highest probability of any dosage is less than what's specified in --dose-thres, it will be set as missing.
We have performed manual check. Scores generated from PRSice when --hard is used are now identical to those generated from PLINK. Scores generated using dosage also have high correlation with those generated from --hard.
Support both SNP_ID and RS_ID for BGEN format. If RS_ID not found in base, we will try to match with SNP_ID
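
The difference between the two thresholds can be shown with a small Python sketch. It assumes genotype probabilities are given as (P(0 copies), P(1 copy), P(2 copies)) and that the dosage is P(1) + 2 × P(2); the function names are hypothetical and the code illustrates the rules as stated above rather than PRSice's own implementation.

```python
from typing import Optional, Tuple

Probs = Tuple[float, float, float]  # P(0 copies), P(1 copy), P(2 copies)


def hard_call(probs: Probs, hard_thres: float) -> Optional[int]:
    """Return the nearest hardcall (0, 1 or 2) if the dosage is within
    hard_thres of it, otherwise None to represent the missing code."""
    dosage = probs[1] + 2.0 * probs[2]
    nearest = min(2, int(dosage + 0.5))  # round half up, capped at 2
    return nearest if abs(dosage - nearest) < hard_thres else None


def passes_dose_thres(probs: Probs, dose_thres: float) -> bool:
    """False (set to missing) when the highest probability is below dose_thres."""
    return max(probs) >= dose_thres


confident = (0.0, 0.05, 0.95)
flat = (0.3, 0.4, 0.3)

print(hard_call(confident, 0.1), passes_dose_thres(confident, 0.9))
print(hard_call(flat, 0.1), passes_dose_thres(flat, 0.9))
```

The second genotype shows why the two checks are separate: its dosage is exactly 1, so it passes the hardcall distance rule, yet no single genotype is probable enough to pass the dose threshold.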

Releases: choishingwan/PRSice