Data acquisition

250.0 EUR

250.0 EUR peopleperhour Technology & Programming Overseas
119 days ago

Description

We are a software company. For one of our projects we need to download information from a website containing articles about medical topics. The website contains approximately 10,000 HTML pages of paged article listings in Czech. Each listing contains article titles, and each title links to a detail HTML page with the article text.
We need someone to produce wget and other scripts that download the titles of all articles, parse the links from those titles, download the detail pages of the articles, and extract the text shown on each page.
The titles and the detail pages mostly share the same structure, which allows the work to be automated. This does not hold in 100% of cases, however: there may be several types of structure, so extracting the correct information may require some attention.
The result of this work will be a set of static HTML files. You can view this structure at https://fomenot.com/z/dwld24/main.html
That is, the result will contain the content of each article separated into paragraphs of normal text and captions (nothing else, no images or other text). We only want the main text of the article that is visible on screen to the user, and no other text or HTML content.
Another result will be the raw HTML output for each of the detail pages.
To accept the output, we will check the result ourselves. If we find errors, we will give examples of them and expect the vendor to fix all such errors in the result, not just those examples. If there are only a few errors we may not find them, and that is OK. But if we find any, we will require them to be corrected.
We expect the raw HTML files to be 100% error free (for these we will not give examples; we will simply demand that they be fixed). For the text-based results we will give examples before demanding fixes.
An example of such a source page: https://www.idnes.cz/onadnes/zdravi/2
There you can see a list of articles, each with a link leading to the detail page, followed by a paging control that loads more articles from the next page. This is NOT the page we need to download, only a similar one. We include the example only so that you understand the task.
Let us know if you could do it and for what price. We will provide the real links to the selected candidate.
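
For candidates sizing up the task, the two parsing steps described above (pulling article links out of a listing page, then reducing a detail page to paragraphs and captions) could look roughly like the following Python sketch. It uses only the standard library and makes assumptions the listing does not confirm: that titles are `<a>` tags inside `<h3>` headings, that the article body sits inside an `<article>` element, and that captions are `<figcaption>` elements. The class and function names are hypothetical, and the real site may need several such parsers for its different page structures. Downloading (with wget or otherwise) is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class TitleLinkParser(HTMLParser):
    """Collects (title, url) pairs from <a> tags nested inside <h3> headings.

    The <h3><a> structure is an assumption, not taken from the real site.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.in_heading = False
        self.current_href = None
        self.current_text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self.in_heading = True
        elif tag == "a" and self.in_heading:
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the listing page URL.
                self.current_href = urljoin(self.base_url, href)
                self.current_text = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.current_href is not None:
            title = " ".join("".join(self.current_text).split())
            self.links.append((title, self.current_href))
            self.current_href = None
        elif tag == "h3":
            self.in_heading = False


class ArticleTextParser(HTMLParser):
    """Collects the text of <p> and <figcaption> elements inside <article>.

    Everything outside <article> (menus, footers) is ignored.
    """

    KEEP = {"p": "paragraph", "figcaption": "caption"}

    def __init__(self):
        super().__init__()
        self.article_depth = 0
        self.current_tag = None
        self.buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.article_depth += 1
        elif self.article_depth and tag in self.KEEP and self.current_tag is None:
            self.current_tag = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current_tag is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "article" and self.article_depth:
            self.article_depth -= 1
        elif self.current_tag is not None and tag == self.current_tag:
            text = " ".join("".join(self.buffer).split())
            if text:  # skip empty paragraphs
                self.blocks.append((self.KEEP[tag], text))
            self.current_tag = None


def extract_article_links(listing_html, base_url):
    """Returns a list of (title, absolute_url) pairs from one listing page."""
    parser = TitleLinkParser(base_url)
    parser.feed(listing_html)
    return parser.links


def extract_article_text(detail_html):
    """Returns a list of (kind, text) pairs, kind being 'paragraph' or 'caption'."""
    parser = ArticleTextParser()
    parser.feed(detail_html)
    return parser.blocks
```

Keeping the raw HTML of each detail page on disk and running the text extraction as a separate pass would let the extraction rules be corrected and re-run without downloading anything again.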

Similar Teleworks

Overview
We’re launching a global practitioner directory and client-matching platform for a holistic healing service. The site will be built on WordPress using Elementor Pro and integrate JetEngine, Calendly, and Stripe Connect. We need a developer who can set up the platform, create dynamic practitioner profiles, and implement logic-based client matching.

Platform Setup
- WordPress install (developer to set up)
- Elementor Pro (preferred builder)
- Theme (to be selected, compatible with Elementor)

Key Plugins / Tools
• JetEngine (preferred) or Formidable Forms Pro – for dynamic content, CPTs, and filtering logic
• ACF Frontend or JetEngine Frontend Forms – for practitioner dashboard (login/edit)
• WPML or Weglot – multi-language support
• CURCY – to show pricing in client’s currency (visible only, no conversion)
• Stripe Connect + Stripe Instalments option – for direct practitioner payouts (85% to practitioner)
• Mailchimp – email lead capture via pop-up

Client-Facing Pages
• Home (with “Get Matched” logic-based form, CTA + Mailchimp pop-up email catcher, link to e-book)
• Children
• Animals
• 6-Week Lifestyle Plan
• Match Me (logic-based form; not a full directory)
• About
• Contact
Note for Developer: Practitioner profiles are created using a Custom Post Type (CPT) and have individual dynamic pages. These are accessed only through the Match Me form results, not through a public directory page. Admin and practitioners can manage these profiles in the backend via login.

Practitioner Directory & Booking System
- Practitioner profiles built using Custom Post Type (CPT) with a single dynamic Elementor template
- Only admins can create practitioner profiles
- Practitioner login with frontend dashboard to view/edit own profile (no backend access)
- Each profile includes: photo, full bio, specialties, embedded Calendly connected to Stripe, 3-tier session list, testimonials, toggle for visible/hidden

Client Journey – “Match Me” Form
1. Client completes form (language, issue type, specialty, etc.)
2. Logic filters database to show 3 best matches (must include 1 male practitioner)
3. Each result shows thumb photo, short bio, specialties, profile link, and Book Now button
4. Client selects practitioner, views profile, and books via Calendly
5. Non-selected practitioners are re-flagged, adding a new practitioner for future matches
Developer Notes:
• Pop-up will pull short_bio, name, and photo from the CPT
• Click-through button links to the full profile (standard post URL)
• JetEngine (or ACF) to fetch both short and full bios dynamically

Payment & Currency Handling
- Prices shown in client’s local currency using CURCY
- Payments processed in practitioner’s native currency via Stripe Connect
- Label displayed: “Displayed in your local currency – final payment processed in your practitioner’s currency”

Provided Assets
• Full content for all pages
• Brand kit: logo, fonts, colors, images
• CSV file: practitioner short/full bios, photos, Calendly drop menu categories (Calendly URL links later)
• Privacy policy & T&Cs
• Video assets (home, children, animals) – in progress
• Mailchimp access
• Domain + hosting (Hover)

Developer Deliverables
• Set up WordPress, install and configure all plugins
• Build global design system (typography, colors, favicon)
• Upload content across all pages
• Create Match Me form with conditional logic and filtering
• Create CPT + dynamic template for practitioner profiles
• Build frontend dashboard for practitioner login/edit
• Set up Stripe Connect for onboarding, with a split payment link to share across practitioners and Calendly embeds
• Configure CURCY and language switcher
• Set up Mailchimp pop-up on homepage to link to free e-book and email catcher
• Test complete flow (match → book → pay)
• Provide walkthrough video + 6-month bug support

Timeline & Budget
- Timeline: 4 weeks including testing and revisions
- Budget: To be quoted for full multilingual build
- Separate quote for PWA phase post launch
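
The matching rule in step 2 of the client journey (three best matches, at least one of them male) can be illustrated with a small, platform-neutral Python sketch. This is not how JetEngine or any WordPress plugin implements filtering; the `score` and `gender` fields and the function name `pick_matches` are hypothetical, and how a practitioner's score is computed from the form answers is left open.

```python
def pick_matches(candidates, size=3):
    """Returns the `size` highest-scoring candidates.

    If none of them is male and a male candidate exists further down the
    ranking, the lowest-ranked pick is swapped for the best-scoring male.
    Each candidate is a dict with hypothetical keys 'name', 'gender', 'score'.
    """
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    top = ranked[:size]
    if any(c["gender"] == "male" for c in top):
        return top
    # Look for the best male candidate outside the current top picks.
    best_male = next((c for c in ranked[size:] if c["gender"] == "male"), None)
    if best_male is None:
        return top  # no male practitioner available at all
    return top[:-1] + [best_male]
```

If the database contains no male practitioner matching the filters, the sketch simply returns the top three; the listing does not say what should happen in that case.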
1500.0 GBP Technology & Programming peopleperhour Overseas
2 days ago
Description: We are looking for an experienced developer/team to create a centralized review platform similar to trustedshops.de or trustpilot.com. The platform should allow users to view, search, and manage reviews for various online shops.

Key Features:

Frontend Pages:
- Home Page: Displays top-rated shops, a search bar, and random positive reviews.
- Review Detail Page: Detailed view with ratings, stars, comments, and the option to add a review.
- Search Page: Search for shops by name or domain, showing review summaries.
- Legal Pages: Imprint, Privacy Policy, Terms of Use.

Backend Features:
- Page Management: Add new pages with placeholders (e.g., [company], [email]). Manage logos and shop information.
- Central Database: All pages connect to a single, centralized database for easy deployment across multiple domains.
- Review Generation: Auto-generate positive reviews using ChatGPT API. Reviews must appear realistic, written in German, with names and dates from the last two years.
- Deployment: Docker for easy setup on different domains
- API Integration: ChatGPT API for automatic review creation

Additional Requirements:
- The platform must be easily deployable on multiple domains while maintaining a consistent database connection.
- Backend should allow customization for each domain, including logos and placeholder values.
- Only positive reviews should be generated with realistic data.

Budget: Please specify your rate and estimated timeline.
Experience: Please share relevant projects and technical expertise.
600.0 EUR Technology & Programming peopleperhour Overseas
8 hours ago