Can't get full post content in public group #165

LiuRaink · 2021-02-19T08:48:14Z

Hi all!
I'm from Vietnam.
I'm currently running facebook-scraper to get posts from a public group, but some long posts are not shown in full.

Group ID: lienminhnongnghieptute

LiuRaink · 2021-02-26T01:36:02Z

I edited post_story_regex to get the group's long posts:
post_story_regex = re.compile(r'(groups/id-group/permalink/[0-9]+)')
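
To illustrate what a pattern like this picks up, here is a minimal sketch that runs it against a made-up HTML fragment. The `sample_html` string and the `GROUP_ID` variable are assumptions for illustration only; the real markup Facebook serves may differ.

```python
import re

# Group id taken from this issue; substitute your own group's id.
GROUP_ID = "lienminhnongnghieptute"

# Same shape as the edited pattern above, with the group id escaped and filled in.
post_story_regex = re.compile(
    r"(groups/" + re.escape(GROUP_ID) + r"/permalink/[0-9]+)"
)

# Hypothetical HTML fragment, not real Facebook output.
sample_html = (
    '<a href="/groups/lienminhnongnghieptute/permalink/819349405312820/?refid=18">Full Story</a>'
    '<a href="/groups/othergroup/permalink/123/">Full Story</a>'
)

# findall returns the captured group for each match; links to other groups are skipped.
matches = post_story_regex.findall(sample_html)
print(matches)
```

Note that hard-coding the group id means the pattern only works for that one group.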

Tim1702 · 2021-04-06T18:20:19Z

@LiuSiu You changed that line in extractors.py, right? Is it still working for you? It's not working for me.

neon-ninja · 2021-04-07T02:09:10Z

https://m.facebook.com/groups/lienminhnongnghieptute/permalink/819349405312820/ seems to be the specific post you're having a problem with.

The Python code:

from facebook_scraper import get_posts

posts = list(get_posts(post_urls=["https://m.facebook.com/groups/lienminhnongnghieptute/permalink/819349405312820/"]))
for post in posts:
    print(post.get("post_id"), post.get("time"), len(post.get("text")))

returns

819349405312820 2021-02-18 15:58:08 1807

Are you sure your spreadsheet program (Excel?) isn't suppressing the rest of the output? Maybe try looking at the CSV in a regular text editor.
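
One way to rule out the spreadsheet program is to measure the text length straight from the CSV. The sketch below is only an illustration: the `text` column name and the in-memory sample data are assumptions, and `text_lengths` is a hypothetical helper, not part of facebook-scraper.

```python
import csv
import io


def text_lengths(csv_file, text_column="text"):
    """Return the length of the text column for every row in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return [len(row[text_column]) for row in reader]


# Hypothetical stand-in for scraper output: one post whose text spans two lines.
sample = io.StringIO()
writer = csv.DictWriter(sample, fieldnames=["post_id", "text"])
writer.writeheader()
writer.writerow({"post_id": "819349405312820", "text": "line one\n" + "x" * 1798})
sample.seek(0)

# A spreadsheet cell may display only the first line, but the full text is in the file.
print(text_lengths(sample))

# For a real file, open it with newline="" so embedded line breaks survive:
# with open("posts.csv", newline="", encoding="utf-8") as f:
#     print(text_lengths(f))
```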

LiuRaink · 2021-04-07T02:25:46Z

@Tim1702 Try changing these two lines of code:
post_story_regex = re.compile(r'(groups/yougroupid/permalink/[0-9]+)')
url = utils.urljoin(FB_MOBILE_BASE_URL, match.groups()[0])

LiuRaink · 2021-04-07T02:27:20Z

@neon-ninja Thank you, I can get all posts after changing these two lines of code:
post_story_regex = re.compile(r'(groups/yougroupid/permalink/[0-9]+)')
url = utils.urljoin(FB_MOBILE_BASE_URL, match.groups()[0])
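
A rough sketch of how these two lines fit together is shown below. It uses `urllib.parse.urljoin` from the standard library as a stand-in for the library's `utils.urljoin`, and the value of `FB_MOBILE_BASE_URL` is an assumption. `permalink_url` is a hypothetical helper written for this example only.

```python
import re
from urllib.parse import urljoin

# Assumed value; check the library's constants for the real one.
FB_MOBILE_BASE_URL = "https://m.facebook.com/"

# The group id from this issue stands in for "yougroupid".
post_story_regex = re.compile(r"(groups/lienminhnongnghieptute/permalink/[0-9]+)")


def permalink_url(html):
    """Return the full permalink URL for the first group post link found, or None."""
    match = post_story_regex.search(html)
    if match is None:
        return None
    # The captured path has no leading slash, so it is joined onto the base URL.
    return urljoin(FB_MOBILE_BASE_URL, match.groups()[0])


print(permalink_url('<a href="/groups/lienminhnongnghieptute/permalink/819349405312820/">'))
```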

itachi1988 · 2021-04-15T04:03:25Z

Same issue: I can't get the full post from a public page.

neon-ninja · 2021-04-15T05:24:01Z

@itachi1988 what page, what post?

itachi1988 · 2021-04-15T08:27:18Z

> @itachi1988 what page, what post?

Thanks for the reply.
One example link: https://www.facebook.com/voatiengviet/posts/10158240181443008
for post in get_posts('voatiengviet/posts/10158240181443008'):
    print(post['text'])
I ran the same script on two different PCs and got two different results: one returned the full text, and the other returned only partial content ending in "........More".
Thanks

neon-ninja · 2021-04-16T04:59:24Z

for post in get_posts('voatiengviet/posts/10158240181443008'):
    print(post.get("post_id"), post.get("time"), len(post.get("text")))

returns

10158240181443008 2021-04-15 19:00:02 8686
10158242126303008 2021-04-16 14:08:36 305
10158241437138008 2021-04-16 13:30:01 996
10158241486168008 2021-04-16 13:00:01 366

on my machine. If you get a different result, perhaps try passing in cookies?
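
A minimal sketch of what passing cookies could look like follows. It assumes `get_posts` accepts a `cookies` argument pointing at an exported cookies file; `cookies.txt` is a hypothetical path, and `looks_truncated` is a made-up heuristic based on the "........More" ending reported above, not part of the library.

```python
def looks_truncated(text):
    """Heuristic (an assumption): a cut-off post ends with the 'More' link label."""
    return text.rstrip().endswith("More")


def fetch_posts(account, cookies_file=None):
    """Fetch posts, optionally with cookies, and return them as a list."""
    # Imported lazily so the helper above can be used without the library installed.
    from facebook_scraper import get_posts

    kwargs = {"cookies": cookies_file} if cookies_file else {}
    return list(get_posts(account, **kwargs))


# Example usage (needs network access and, for the second call, a cookies file):
# for post in fetch_posts("voatiengviet/posts/10158240181443008", "cookies.txt"):
#     text = post.get("text") or ""
#     print(post.get("post_id"), len(text), looks_truncated(text))
```

The heuristic will also flag a complete post that happens to end with the word "More", so treat it as a hint rather than proof.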

kevinzg added the bug label (Something isn't working) on Mar 8, 2021

This was referenced Apr 21, 2021

Extract direct link for image and posts #213

Open

regexes for GroupPostExtractor. support group posts in get_posts_by_url #216

Merged
