WhatsApp Web Programmatic Triggers of Certain Operations

60.0 GBP

60.0 GBP peopleperhour Technology & Programming Overseas
345 days ago

Description

OK, here is the spec:
From WhatsApp Web (pre-logged in, via Chrome, Firefox, whatever):
- Select a predefined conversation.
- Download the full message history and store it in a sensible folder structure (i.e. chats/id_of_chat/images, audio, messages).
- The primary organisation file should ideally be both JSON and CSV (CSV mainly for ease of checking), with columns:
sender, senttimedate, raw_innerText, raw_text, has_audio, voicenote_codes, has_image, image_codes
senttimedate could ideally be subdivided into time and date. This information is simply missing in some instances, as far as I can tell.
raw_innerText is what I get from calling element.innerText in JavaScript on the relevant message "row" (I believe the divs have role="row" or something similar).
raw_text is the raw text content of the message (if applicable).
has_audio, voicenote_codes, has_image and image_codes are pretty self-explanatory. The codes just temporarily tie the downloaded media to the relevant message before it is shuffled into the appropriate folder and renamed sensibly. Use your discretion.
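As a rough illustration of the output format described above, here is a minimal Python sketch that creates the chats/id_of_chat folders and writes the rows to both JSON and CSV. The column names come from the spec. Everything else is an assumption made for illustration: the function names, the file names messages.json and messages.csv, the extra time and date columns, and especially the timestamp shape ("10:32, 12/05/2023"), which should be checked against what the page actually shows.

```python
import csv
import json
import re
from pathlib import Path

# Column order taken from the spec above.
COLUMNS = ["sender", "senttimedate", "raw_innerText", "raw_text",
           "has_audio", "voicenote_codes", "has_image", "image_codes"]

# ASSUMPTION: stamps look like "10:32, 12/05/2023" (optionally with AM/PM).
_STAMP = re.compile(
    r"(?P<time>\d{1,2}:\d{2}(?:\s?[APap][Mm])?),\s*(?P<date>\d{1,2}/\d{1,2}/\d{2,4})"
)


def split_senttimedate(value):
    """Return (time, date) strings, or ("", "") if the stamp is missing or unrecognised."""
    match = _STAMP.search(value or "")
    if not match:
        return "", ""
    return match.group("time"), match.group("date")


def chat_dirs(root, chat_id):
    """Create chats/<chat_id>/{images,audio,messages} and return the chat folder."""
    chat = Path(root) / "chats" / str(chat_id)
    for sub in ("images", "audio", "messages"):
        (chat / sub).mkdir(parents=True, exist_ok=True)
    return chat


def write_messages(root, chat_id, rows):
    """Write rows to messages.json and messages.csv (hypothetical file names)."""
    out = chat_dirs(root, chat_id) / "messages"
    cleaned = []
    for row in rows:
        # Missing columns become empty strings, since some data is simply absent.
        record = {col: row.get(col, "") for col in COLUMNS}
        record["time"], record["date"] = split_senttimedate(record["senttimedate"])
        cleaned.append(record)

    (out / "messages.json").write_text(
        json.dumps(cleaned, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    with open(out / "messages.csv", "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=COLUMNS + ["time", "date"])
        writer.writeheader()
        for record in cleaned:
            # Lists of media codes are joined so each message stays on one CSV row.
            flat = {k: ";".join(v) if isinstance(v, list) else v
                    for k, v in record.items()}
            writer.writerow(flat)
    return out
```

Keeping the original senttimedate column alongside the split time and date values means nothing is lost when the pattern fails to match.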
Images can be downloaded trivially, as a blob URL is embedded in the HTML.
Voice notes: I just cannot find any way to download them without simulating user input.
This is fine! BUT I have had a lot of headaches with simulation libraries before: clicknium turned out to be fake open source, with random server failures at times. Selenium's syntax seems to be lacking in some obvious ways. I haven't tried Puppeteer, so maybe that's an option. But here's the thing: if I can just set up the Chrome profiles manually, then use pyautoit and pyautogui (along with some pytesseract) to navigate the browser, plus Tampermonkey to inject some JS, it seems robust and relatively immune to random browser changes breaking specific automation libraries.
I am so close to just finishing this up myself but gods, I'm just wasting so much time I may as well hand it over to someone who can get this done in a sensible way.
Input options:
- Conversation to target (may be group or individual; no need to worry about activating the conversation, this is handled; each will be sorted into a separate folder).
- Scrape method: COMPLETE / UPDATE ONLY (COMPLETE is typically used the first time round, from then on only UPDATES).
It is primarily just important that VOICE NOTES and IMAGES are downloaded and referenced in the output data format I suggested.
Ideally, it will be configurable to operate on a loop, iterating through a list of conversations, first building the available data, then just updating as needed.
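The COMPLETE / UPDATE ONLY behaviour and the loop over conversations could be sketched roughly as follows. This is only one possible reading of the spec: the names merge_rows and run_once are made up, scrape stands in for whatever browser automation actually collects the rows, and store is a plain dict standing in for the files on disk. The de-duplication key (sender, senttimedate, raw_innerText) is also an assumption and would collapse two genuinely identical messages sent in the same minute.

```python
COMPLETE = "COMPLETE"
UPDATE_ONLY = "UPDATE ONLY"


def message_key(row):
    """Identify a message by sender, timestamp and row text (an assumed, imperfect key)."""
    return (row.get("sender", ""), row.get("senttimedate", ""), row.get("raw_innerText", ""))


def merge_rows(existing, scraped, mode):
    """COMPLETE replaces the stored history; UPDATE ONLY appends rows not already stored."""
    if mode == COMPLETE:
        return list(scraped)
    if mode != UPDATE_ONLY:
        raise ValueError(f"unknown scrape mode: {mode!r}")
    seen = {message_key(row) for row in existing}
    merged = list(existing)
    for row in scraped:
        key = message_key(row)
        if key not in seen:
            seen.add(key)
            merged.append(row)
    return merged


def run_once(conversations, scrape, store):
    """One pass over the conversation list.

    `scrape(chat_id, mode)` and `store` (a dict of chat_id -> rows) are
    hypothetical hooks, not part of any real library.
    """
    for chat_id in conversations:
        existing = store.get(chat_id)
        # First visit builds the full history; later visits only update it.
        mode = UPDATE_ONLY if existing else COMPLETE
        scraped = scrape(chat_id, mode)
        store[chat_id] = merge_rows(existing or [], scraped, mode)
    return store
```

Calling run_once repeatedly (for example from a timed loop) gives the "build first, then update" behaviour, with each conversation tracked separately.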
Preferably, use Python, user-input simulation and whatever browser automation is needed. A GUI is not strictly necessary as long as it just works.
I know some people use Node.js, but because I personally never bothered to get familiar with it, I would prefer to rule that out unless you can break down the setup instructions like I am a 5 year old.
NO APIs; a WhatsApp Business account will not be used.
