
Webtext2Insight

Introduction

Workshop on Applied Web Scraping, NLP, and GenAI for Business & Research, University of Leeds, 9-14 Jun 2025

INTEREST CATEGORY: MARKETING RESEARCH
POSTING TYPE: Events

Posted by: Barbara Summers


Webtext2Insight

An introductory workshop on Applied Web Scraping, NLP, and GenAI for Business & Research

The Centre for Decision Research at the University of Leeds is running an in-person introductory workshop on web scraping, natural language processing, and generative AI for research and business in June. It includes an optional introduction to Python. For more details, see below. We hope to see some of you there!

Course Overview

The web is a vast resource of valuable data. How can businesses and researchers extract insights from it? Do people leave online reviews of your products that you would like to use? Do people talk about your research topic online? Do you wonder how the sentiment of policy language on a topic is changing? This four-day in-person workshop provides hands-on training in web scraping, natural language processing (NLP), and generative AI using state-of-the-art Python tools.

Programming Languages: Python (AI tools: Hugging Face, OpenAI API)

After this course, you will be able to:

Extract web data efficiently, while understanding the legal and ethical issues.

Analyse and interpret text using NLP techniques such as Topic Modelling.

Leverage sentence embeddings for Text Classification.

Use generative AI (GPT models) for content creation, automation, and decision-making.

Who Should Attend?

This course is ideal for:

  • Researchers and academics working with text data.
  • Business analysts and consultants looking to apply AI-driven insights.
  • Data scientists and professionals interested in web data extraction.

Requirements

Some exposure to programming (not necessarily Python) is beneficial.

If you are a beginner, let us know, so we can send you materials to get a head-start.

Course Structure

Monday 9 June – Day 1: Introduction to Python (Optional)

  • Learn how to complete programming tasks in Google Colab.
  • Learn Python syntax, including functions, loops, and error handling.

Tuesday 10 June – Day 2: Web Scraping

  • Learn how to extract data from websites ethically and efficiently.
  • Work with HTML, CSS, and APIs to collect structured data.
  • Hands-on project: Scraping real-world data.

Wednesday 11 June – Day 3: Natural Language Processing

  • Learn key NLP techniques: tokenisation, sentiment analysis, and topic modelling.
  • Hands-on project: Apply topic modelling to derive insights from real data.

Thursday 12 June – Day 4: Embeddings & Generative AI for Business & Research

  • Use Hugging Face's transformers for text embeddings and basic tasks like Named Entity Recognition.
  • Explore OpenAI's GPT models for content generation and automation.
  • Hands-on project: Automatically code information from online reviews.

Why Take This Course?

Practical & Hands-On: Learn by doing with real-world datasets.

Cutting-Edge Techniques: Stay ahead with the latest AI and data analytics tools.

Expert Instructors: Lukasz Walasek (Warwick), Xingjie Wei (Leeds), Simon van Baal (Leeds).

Networking Opportunities: Connect with like-minded professionals and researchers.

Seats are limited! Register now to secure your spot.