dyau_project_idea.md

File metadata and controls

42 lines (29 loc) · 2.52 KB

Project Proposal: Determining Online Article Accuracy Compared to Link Title (Clickbait Detection)

I am interested in determining whether an online article link can be predicted to be clickbait from the structure and wording of its title.

This is interesting to me because learning to sift through the deluge of information in the modern world is rapidly becoming a required skill. Determining whether phrasing is truthful matters both for artificial intelligence and for making the most of a person's limited time.

I would like to know:

  1. Whether there is a particular pattern used to mask clickbait. This would imply that we as humans have a predisposition to certain communication patterns.
  2. Whether article sources can be assigned a coefficient, implying that we can rank article sources by "trustworthiness".
  3. Which cognitive distortions and logical fallacies are most exploited in the creation of clickbait.
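As a starting point for the first question, here is a minimal sketch of the kind of structural title features that could be extracted before any modelling. The feature names, the `title_features` helper, and the cue-phrase list are all assumptions made for illustration; none of them come from a finished design, and the real feature set would need to be chosen from the data.

```python
import re

# Hypothetical cue phrases, picked only to illustrate the idea.
CLICKBAIT_CUES = (
    "you won't believe",
    "this one",
    "what happened next",
    "reasons why",
)


def title_features(title: str) -> dict:
    """Return simple structural features of a link title (illustrative only)."""
    words = title.split()
    lowered = title.lower()
    return {
        # Length of the title in whitespace-separated words.
        "word_count": len(words),
        # Listicle-style titles often open with a number.
        "starts_with_number": bool(re.match(r"\d+", title.strip())),
        "has_question_mark": "?" in title,
        "has_exclamation": "!" in title,
        # Count of words written entirely in capitals (length > 1).
        "all_caps_words": sum(1 for w in words if len(w) > 1 and w.isupper()),
        # Direct address to the reader.
        "second_person": bool(re.search(r"\b(you|your)\b", lowered)),
        # Number of cue phrases from the assumed list that appear.
        "cue_phrases": sum(1 for cue in CLICKBAIT_CUES if cue in lowered),
    }


if __name__ == "__main__":
    print(title_features("10 Reasons Why You Won't Believe THIS!"))
```

Features like these could feed a classifier later; whether any of them actually separates clickbait from honest titles is exactly what the project would have to test.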

To do this, we need three groups of data sources:

  1. Truth control: a group of article sources that are considered to be telling the truth.
  2. Liar control: a group of article sources that are considered to be misrepresenting the truth.
  3. Variable: the group of article sources that we will compare against the prior two to determine their "trustworthiness".
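One possible way to turn these three groups into a per-source coefficient is sketched below. It assumes that every title has already been given a numeric clickbait score between 0 and 1 by some earlier step (not defined here), and it places a variable source on a line between the two control means. The `trust_coefficient` function and the linear scaling are assumptions for illustration, not a settled method.

```python
from statistics import mean


def trust_coefficient(source_scores, truth_scores, liar_scores):
    """Place a source between the controls: 1.0 = like truth, 0.0 = like liar.

    All inputs are lists of per-title clickbait scores (assumed to exist).
    """
    truth_mean = mean(truth_scores)
    liar_mean = mean(liar_scores)
    if truth_mean == liar_mean:
        # The controls give no scale to measure against.
        raise ValueError("control groups are indistinguishable")
    # Linear position of the source mean between the two control means.
    position = (liar_mean - mean(source_scores)) / (liar_mean - truth_mean)
    # Clamp so sources outside the control range stay within [0, 1].
    return max(0.0, min(1.0, position))


if __name__ == "__main__":
    truth = [0.1, 0.3]   # made-up scores for the truth control
    liar = [0.7, 0.9]    # made-up scores for the liar control
    print(trust_coefficient([0.5], truth, liar))
```

A single mean is a crude summary; a real analysis would likely also need to account for the spread of scores within each group and for how many titles each source contributes.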

For the truth control group I intend to use articles from:

For the variable group, I will compare and analyze:

For the liar control group I intend to use articles from:

Further sources can be taken from this list on Fake News Watch

I am also considering comparing Kickstarter pitches to their article content and relating that comparison to each campaign's success: http://webrobots.io/kickstarter-datasets/