A few notes on domain discovery for Project Eudaimonia:
- Keeping translated sites in for two reasons: (1) it helps WP, since translated sites may have different content, which may be inaccessible; (2) it helps Equalify, since the more data we have, the better we can tune our systems!
- Allowing externally hosted sites for now, e.g. https://wordpress.github.io/wordpress-playground/
- We'll both crawl sites and scan their sitemaps, as per Matt's request
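
To make the crawl-plus-sitemap idea concrete, here is a rough sketch of how the two sources could be merged into one set of discovered URLs. It is only an illustration, not how the project actually does it: the function names (`urls_from_sitemap`, `urls_from_page`, `discover`) are hypothetical, `example.org` is a placeholder, and fetching is left out so the snippet works on text that has already been downloaded. It assumes sitemaps follow the standard sitemaps.org namespace.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(xml_text):
    """Return every <loc> value from a sitemap or sitemap index (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text}


class _LinkParser(HTMLParser):
    """Collects the raw href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def urls_from_page(page_url, html_text):
    """Return absolute http(s) links found on one crawled page (hypothetical helper)."""
    parser = _LinkParser()
    parser.feed(html_text)
    found = set()
    for href in parser.hrefs:
        # Resolve relative links and drop #fragments so duplicates collapse.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if absolute.startswith(("http://", "https://")):
            found.add(absolute)
    return found


def discover(sitemap_xml, pages):
    """Union of sitemap URLs and crawled links.

    `pages` maps a page URL to its HTML. No host filter is applied, so
    externally hosted and translated sites stay in the result.
    """
    found = urls_from_sitemap(sitemap_xml)
    for page_url, html_text in pages.items():
        found |= urls_from_page(page_url, html_text)
    return found
```

Because the result is a plain set union, a URL that shows up in both the sitemap and the crawl is only counted once, and anything that appears in just one of the two sources is still picked up.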