Skip to content

R tools to download, ingest, and analyze the Phoenix dataset from the Open Event Data Alliance

License

Notifications You must be signed in to change notification settings

ahalterman/phoxy

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

47 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

phoxy

Download, ingest, and analyze the Phoenix and ICEWS event datasets.

The Phoenix event data set records social and political events automatically extracted from news reporting. phoxy provides functions to make acquiring and analyzing this event data easier. For more information about Phoenix, see its website here: http://phoenixdata.org/ and the website of the Open Event Data Alliance here: http://openeventdata.org/. phoxy also includes basic functionality for working with the ICEWS event dataset.

phoxy, like the Phoenix event dataset, is still under beta development and will change greatly. Bug reports are welcome.

Installation

devtools::install_github("ahalterman/phoxy")

Usage

Currently, phoxy includes the following functions:

  • download_phoenix takes a directory path and version number and downloads the entire Phoenix dataset (of that version) from http://phoenixdata.org/data into the given directory. It currenty accepts two versions: "current" and "v0.1.0".

  • download_icews takes a directory path and downloads the entire ICEWS event dataset from https://dataverse.harvard.edu/dataverse/icews.

  • update_phoenix takes a directory path that contains Phoenix files and downloads any of the daily updates that are missing from the directory. Running it on an empty directory is equivalent to running download_phoenix, and running it daily will make sure you always have yesterday's data downloaded.

  • ingest_phoenix reads in each daily text file from that directory, does some basic column name and class fixing, and combines them all into one large dataframe for easy analysis.

  • ingest_icews reads in a directory of files containing the ICEWS event dataset.

  • convert_cameo will translate from CAMEO event codes to the human readable definitions taken from the CAMEO codebook.

Example:

download_phoenix("~/OEDA/phoxy_test/", phoenix_version = "current")
events <- ingest_phoenix("~/OEDA/phoxy_test/")

> str(events)

Classes ‘tbl_df’, ‘tbl’ and 'data.frame':	699429 obs. of  26 variables:
 $ EventID             : chr  "0_v0.2.0" "1_v0.2.0" "2_v0.2.0" "3_v0.2.0" ...
 $ Date                : Date, format: "2014-06-20" "2014-06-20" "2014-06-20" "2014-06-16" ...
 $ Year                : int  2014 2014 2014 2014 2014 2014 2014 2014 2014 2014 ...
 $ Month               : int  6 6 6 6 6 6 6 6 6 6 ...
 $ Day                 : int  20 20 20 16 20 20 19 20 20 20 ...
 $ SourceActorFull     : chr  "THA" "INDPTY" "ESPCOP" "IMGMUSISI" ...
 $ SourceActorEntity   : chr  "THA" "IND" "ESP" "IMG" ...
 $ SourceActorRole     : chr  "" "PTY" "COP" "" ...
 $ SourceActorAttribute: chr  "" "" "" "MUS;ISI" ...
 $ TargetActorFull     : chr  "USA" "IND" "IMGMUSISI" "TUR" ...
 $ TargetActorEntity   : chr  "USA" "IND" "IMG" "TUR" ...
 $ TargetActorRole     : chr  "" "" "" "" ...
 $ TargetActorAttribute: chr  "" "" "MUS;ISI" "" ...
 $ EventCode           : chr  "043" "010" "010" "130" ...
 $ EventRootCode       : chr  "04" "01" "01" "13" ...
 $ PentaClass           : chr  "1" "0" "0" "3" ...
 $ GoldsteinScore      : num  2.8 0 0 -4.4 0 9 8 0 3.4 7 ...
 $ Issues              : chr  "" "" "SECURITY_SERVICES,1" "NAMED_TERROR_GROUP,1" ...
 $ Lat                 : num  25.5 28.6 40.4 39.9 39.8 ...
 $ Lon                 : num  51.2 77.2 -3.7 32.9 -98.5 ...
 $ LocationName        : chr  "State of Qatar" "New Delhi" "Madrid" "Ankara" ...
 $ StateName           : chr  "" "National Capital Territory of Delhi" "Comunidad de Madrid" "Ankara" ...
 $ CountryCode         : chr  "QAT" "IND" "ESP" "TUR" ...
 $ SentenceID          : chr  "53a492ef421aa9436ba6d1e0_3" "53a49741421aa9435ea6d25e_1" "53a4943e421aa94379a6d2ea_6" "53a492e5421aa94369a6d1c5_5" ...
 $ URLs                : chr  "http://www.theguardian.com/global-development/2014/jun/20/thailand-qatar-downgraded-human-trafficking-report" "http://www.thehindu.com/news/national/hindi-issue-cong-advises-caution-says-backlash-in-nonhindi-states/article6133644.ece?utm_"| __truncated__ "http://rss.nytimes.com/c/34625/f/640334/s/3b920a70/sc/11/l/0L0Snytimes0N0C20A140C0A60C170Cworld0Ceurope0Cspanish0Epolice0Etarge"| __truncated__ "http://www.ipsnews.net/2014/06/turkeys-kurdish-problem-likely-worsen-isis-gains-iraq/?utm_source=rss&utm_medium=rss&utm_campaig"| __truncated__ ...
 $ NewsSources         : chr  "guardian_americas" "hindu_nat" "nyteurope" "ips_hr" ...

About

R tools to download, ingest, and analyze the Phoenix dataset from the Open Event Data Alliance

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •  

Languages