Can't get full post content in public group #165

Open
LiuRaink opened this issue Feb 19, 2021 · 9 comments
Labels
bug Something isn't working

Comments

@LiuRaink

Hi all!
I'm from Vietnam.
I'm currently running facebook-scraper to get posts from a public group, but some long posts are not returned in full.

groupid: lienminhnongnghieptute

[Screenshot: 2021-02-19 15_46_36-lienminhnongnghieptute - Excel]

@LiuRaink
Author

LiuRaink commented Feb 26, 2021

I edited the post_story_regex definition so that the group's long posts are fetched in full:
post_story_regex = re.compile(r'(groups/id-group/permalink/[0-9]+)')
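
To illustrate the edit above, here is a minimal sketch of what a pattern of that shape matches. The HTML fragment and the `GROUP_ID` value are made up for illustration; only the shape of the pattern comes from the comment, and the library's surrounding code is not shown here.

```python
import re

# Hypothetical group id, standing in for the "id-group" placeholder above.
GROUP_ID = "lienminhnongnghieptute"

# Same shape as the edited pattern; re.escape guards against regex
# metacharacters appearing in the id.
post_story_regex = re.compile(
    r'(groups/' + re.escape(GROUP_ID) + r'/permalink/[0-9]+)'
)

# Made-up fragment resembling a "More" link on a mobile group page.
sample_html = (
    '<a href="/groups/lienminhnongnghieptute/permalink/'
    '819349405312820/?refid=18">More</a>'
)

match = post_story_regex.search(sample_html)
permalink_path = match.group(1) if match else None
print(permalink_path)
```

The captured group is the relative permalink path, without the leading slash or the query string.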

@kevinzg kevinzg added the bug Something isn't working label Mar 8, 2021
@Tim1702

Tim1702 commented Apr 6, 2021

@LiuSiu You changed that line in extractors.py, right? Is it still working for you? Because it's not working for me.

@neon-ninja
Collaborator

https://m.facebook.com/groups/lienminhnongnghieptute/permalink/819349405312820/ seems to be the specific post you're having a problem with.

The Python code:

from facebook_scraper import get_posts

posts = list(get_posts(post_urls=["https://m.facebook.com/groups/lienminhnongnghieptute/permalink/819349405312820/"]))
for post in posts:
    # Print the post id, timestamp, and length of the extracted text
    print(post.get("post_id"), post.get("time"), len(post.get("text")))

returns

819349405312820 2021-02-18 15:58:08 1807

Are you sure your spreadsheet program (Excel?) isn't suppressing the rest of the output? Maybe try looking at the CSV in a regular text editor.
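
One way to run that check without a spreadsheet program is to read the CSV back with Python's `csv` module and compare lengths. This is only a sketch: the in-memory buffer stands in for the exported file, and the placeholder text merely has the same length (1807) as the result reported above.

```python
import csv
import io

# Placeholder text with the same length as the post reported above.
long_text = "x" * 1807

# Write one row to an in-memory CSV, as a stand-in for the exported file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["post_id", "text"])
writer.writerow(["819349405312820", long_text])

# Read it back with no spreadsheet program in between.
buffer.seek(0)
rows = list(csv.reader(buffer))
stored_length = len(rows[1][1])
print(stored_length)
```

If the length read back matches the length the scraper reported, the text is intact in the file and the truncation is happening in the viewer.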

@LiuRaink
Author

LiuRaink commented Apr 7, 2021

@Tim1702 Try changing these two lines of code:
post_story_regex = re.compile(r'(groups/yougroupid/permalink/[0-9]+)')
url = utils.urljoin(FB_MOBILE_BASE_URL, match.groups()[0])
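
As a rough sketch of what those two lines do together: the pattern captures the relative permalink path, and the join turns it into an absolute mobile URL. Several things here are assumptions rather than the library's code: the standard library's `urljoin` stands in for `utils.urljoin`, the value of `FB_MOBILE_BASE_URL` is assumed, `permalink_url` is a hypothetical helper, and `[\w.]+` is a generalization that replaces the hard-coded group id.

```python
import re
from urllib.parse import urljoin

# Assumed value; the library's actual constant may differ.
FB_MOBILE_BASE_URL = "https://m.facebook.com/"

# Generalized variant of the pattern above: [\w.]+ stands in for the
# hard-coded group id (an assumption, not the commenter's exact edit).
post_story_regex = re.compile(r'(groups/[\w.]+/permalink/[0-9]+)')


def permalink_url(html):
    """Return the absolute permalink URL found in html, or None."""
    match = post_story_regex.search(html)
    if match is None:
        return None
    # Join the captured relative path onto the mobile base URL.
    return urljoin(FB_MOBILE_BASE_URL, match.groups()[0])


sample = '<a href="/groups/lienminhnongnghieptute/permalink/819349405312820/?refid=18">'
print(permalink_url(sample))
```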

@LiuRaink
Author

LiuRaink commented Apr 7, 2021

@neon-ninja Thank you. I can get all the posts after changing these two lines of code:
post_story_regex = re.compile(r'(groups/yougroupid/permalink/[0-9]+)')
url = utils.urljoin(FB_MOBILE_BASE_URL, match.groups()[0])

@itachi1988

Same issue here: I can't get the full post from a public page.

@neon-ninja
Collaborator

@itachi1988 what page, what post?

@itachi1988

itachi1988 commented Apr 15, 2021

@itachi1988 what page, what post?

Thanks for the reply.
One example link: https://www.facebook.com/voatiengviet/posts/10158240181443008

for post in get_posts('voatiengviet/posts/10158240181443008'):
    print(post['text'])

I ran the same script on two different PCs and got two different results: one returned the full text, the other returned only part of the content, ending in "........More".
Thanks

@neon-ninja
Collaborator

for post in get_posts('voatiengviet/posts/10158240181443008'):
    print(post.get("post_id"), post.get("time"), len(post.get("text")))

returns

10158240181443008 2021-04-15 19:00:02 8686
10158242126303008 2021-04-16 14:08:36 305
10158241437138008 2021-04-16 13:30:01 996
10158241486168008 2021-04-16 13:00:01 366

on my machine. If you get a different result, perhaps try passing in cookies?
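
A hedged sketch of that suggestion follows. It assumes `get_posts` accepts a `cookies` argument pointing at an exported cookies file, as the library's README describes; `looks_truncated`, `fetch_with_cookies`, and the file name `cookies.txt` are hypothetical names introduced here. The truncation check is only a heuristic based on the "........More" ending reported above.

```python
def looks_truncated(text):
    """Heuristic: a shortened post tends to end with a 'More' link label."""
    stripped = (text or "").rstrip()
    return stripped.endswith(("More", "...", "\u2026"))


def fetch_with_cookies(account, cookies_path="cookies.txt"):
    """Fetch posts while logged in (hypothetical helper, assumed API)."""
    # Imported lazily so this sketch loads without the library installed.
    from facebook_scraper import get_posts

    # `cookies` is assumed to take a path to an exported cookies file.
    return list(get_posts(account, cookies=cookies_path))


# Usage (needs network access and a real cookies file):
# for post in fetch_with_cookies('voatiengviet/posts/10158240181443008'):
#     print(post.get("post_id"), looks_truncated(post.get("text")))
```

Comparing `looks_truncated` results with and without cookies on both machines would show whether the login state explains the difference.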
