Project Proposal: Determining Online Article Accuracy Compared to Link Title (Clickbait Detection)

I am interested in determining whether an online article link can be predicted to be clickbait from the structure and wording of its title.

This interests me because learning to sift through the deluge of information in the modern world is rapidly becoming a required skill, and determining truth in phrasing matters both for artificial intelligence and for making the most of a person's precious time.

I would like to know:

Whether there is a particular pattern used to mask clickbait. If so, this implies that we as humans have a predisposition to certain communication patterns.
Whether article sources can be assigned a coefficient, which would let us rank article sources by "trustworthiness".
Which cognitive distortions or logical fallacies are most exploited in the creation of clickbait.
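As a starting point for the first question, the "structure and usage of the title" could be turned into a handful of measurable features. The sketch below is only an illustration under assumptions: the function name `title_features`, the chosen features, and the `SECOND_PERSON` word list are hypothetical and not part of the proposal, and the example title is made up.

```python
import re

# Hypothetical cue list; the proposal does not say which words matter.
SECOND_PERSON = {"you", "your", "you're", "yours"}


def title_features(title: str) -> dict:
    """Extract simple structural features from a link title (illustrative only)."""
    # Split into word-like tokens, keeping apostrophes so "won't" stays whole.
    words = re.findall(r"[A-Za-z0-9']+", title)
    lower = [w.lower() for w in words]
    return {
        "word_count": len(words),
        # Listicle-style titles often open with a number ("10 Things ...").
        "starts_with_number": bool(words) and words[0].isdigit(),
        "has_question_mark": "?" in title,
        "has_exclamation": "!" in title,
        # Direct address to the reader.
        "second_person": any(w in SECOND_PERSON for w in lower),
        # Count shouted words, ignoring single letters such as "I" or "A".
        "all_caps_words": sum(1 for w in words if len(w) > 1 and w.isupper()),
    }


if __name__ == "__main__":
    # Made-up title, used only to show the shape of the output.
    print(title_features("10 Things You Won't Believe!"))
```

Features like these could then be compared across the data groups to see whether any pattern actually separates them; whether these particular features are informative is exactly what the project would need to test.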

To do this, we need three groups of data sources:

Truth control: This is a group of article sources that are considered to be telling the truth
Liar control: This is a group of article sources that are considered to be misrepresenting the truth
Variable: This is the group of article sources that we will be comparing against the prior two to determine their "trustworthiness"
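One possible way to turn the three groups into a per-source "trustworthiness" coefficient is to place each variable source on a scale between the two controls. The sketch below assumes that every title has already been given a numeric clickbait score by some model (higher meaning more clickbait-like); that scoring step, the function name `trust_coefficients`, and the linear scaling are all assumptions for illustration, not a method stated in the proposal.

```python
from statistics import mean


def trust_coefficients(truth_scores, liar_scores, variable_scores):
    """Place each variable source between the two control groups.

    Returns a dict mapping source name to a value in [0, 1], where
    1.0 means "scores like the truth control" and 0.0 means
    "scores like the liar control".
    """
    truth_mean = mean(truth_scores)
    liar_mean = mean(liar_scores)
    span = liar_mean - truth_mean
    if span <= 0:
        # The score does not separate the controls, so it cannot rank sources.
        raise ValueError("liar control must score higher than truth control")

    result = {}
    for source, scores in variable_scores.items():
        position = (liar_mean - mean(scores)) / span
        # Clip, since a source can score outside the range of the controls.
        result[source] = min(1.0, max(0.0, position))
    return result


if __name__ == "__main__":
    # Placeholder numbers only; these are not measurements.
    print(trust_coefficients(
        truth_scores=[0.0, 0.0],
        liar_scores=[1.0, 1.0],
        variable_scores={"source_a": [0.25, 0.25]},
    ))
```

The guard on `span` reflects a design concern: if the chosen score cannot tell the truth control from the liar control, any ranking of the variable group built on it would be meaningless.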

For the truth control group I intend to use articles from:

The New York Times Article Search and Most Popular Listings
The US Government's Open Data portal Climate Reports
Wikipedia Page Traffic Statistics from the last 3 months
Reddit's Not The Onion Subreddit

For the variable group, I will compare and analyze:

Mashable
Upworthy
Vox
BBC

For the liar control group I intend to use articles from:

The Onion Politics Section
Reddit's Fake News Subreddit
The Daily Currant
The National Report

Further sources can be taken from this list on Fake News Watch

I am also considering comparing Kickstarter pitches with their article content, measured against their success: http://webrobots.io/kickstarter-datasets/
