Summarizing the latest Spotify releases with ChatGPT

Author: Murphy | View: 23514 | Time: 2025-03-23 19:20:38

In today's fast-paced world, natural language processing (NLP) has become a crucial component of a wide array of applications. Large language models, such as OpenAI's ChatGPT and GPT-4, have unlocked considerable potential for tasks like summarization, speech recognition, semantic search, question answering, chatbots, and more.

I am excited to announce "Large Language Models Chronicles: Navigating the NLP Frontier", a new weekly series of articles that will explore how to leverage the power of large models for various NLP tasks. By diving into these cutting-edge technologies, we aim to empower developers, researchers, and enthusiasts to harness the potential of NLP and unlock new possibilities.

In this first article of the series, we will focus on using OpenAI's ChatGPT and the Spotify API to create an intelligent summarization system for the latest music releases. As the series unfolds, we will delve into a multitude of NLP applications, providing insights, techniques, and practical examples that demonstrate the prowess of large models in transforming the way we interact with and understand language.

Stay tuned for more articles as we embark on this exciting journey through the world of NLP, guiding you through the process of mastering diverse language tasks with state-of-the-art large models.

Figure 1: Are LLMs beginning the new cooperation between men and machines? (source)

The code is available on my GitHub.

Introduction

ChatGPT and GPT-4, developed by OpenAI, are state-of-the-art language models that have shown strong proficiency in various natural language processing tasks. They can understand context, generate human-like responses, and summarize large chunks of text effectively. This makes them a good fit for summarizing the latest music releases on Spotify.

Spotify, a leading music streaming platform, offers an extensive API that provides developers access to a vast amount of music data, including new releases, playlists, and much more. By combining ChatGPT's powerful language understanding capabilities with the rich music data available through the Spotify API, we can build a system that keeps you informed about the latest additions to the Spotify catalog.

We will walk you through the process of building this intelligent music summarization system. Our approach will consist of the following steps:

Accessing the Spotify API: We'll start by fetching data about the latest music releases using the Spotify API.
Summarizing with ChatGPT: Then, we'll use OpenAI's API to generate a concise summary of the latest releases.
Results: Finally, we'll present the summary in an easily readable and engaging format.

Stay tuned as we delve into the details of each step, empowering you to create your very own music summarization tool!

Accessing the Spotify API

In this section, we will explore how to fetch the latest music releases and their associated track data from the Spotify API. We will then save this data to a JSON file for further processing. The following Python functions will be used to achieve this goal:

get_new_releases: Fetch new album releases from Spotify.
get_album_tracks: Retrieve track information for a specific album.
save_data_to_file: Save the fetched data to a JSON file.
load_data_from_file: Load the saved data from the JSON file.
download_latest_albums_data: Download the latest albums and tracks data from Spotify and save it to a JSON file.

Let's break down the key components of these functions and understand how they work together to access the Spotify API.

Fetching New Releases

The get_new_releases function takes two optional arguments, limit and offset. limit determines the maximum number of album results to return, while offset specifies the index of the first result. By default, limit is set to 50 and offset to 0. The function then calls sp.new_releases from the Spotify API, which returns a dictionary containing album information. The relevant album items are extracted and returned as a list of dictionaries.

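import json
import time
from typing import Any, Dict, List

# NOTE: `sp` is assumed to be an authenticated Spotify client object
# (for example a spotipy.Spotify instance) created elsewhere in the project.


def get_new_releases(limit: int = 50, offset: int = 0) -> List[Dict[str, Any]]: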
    """
    Fetch new releases from Spotify.

    Args:
        limit (int, optional): Maximum number of album results to return. Defaults to 50.
        offset (int, optional): The index of the first result to return. Defaults to 0.

    Returns:
        List[Dict[str, Any]]: A list of dictionaries containing album information.
    """
    new_releases = sp.new_releases(limit=limit, offset=offset)
    albums = new_releases["albums"]["items"]
    return albums
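
To make the indexing in get_new_releases concrete, here is a minimal sketch of the paging structure the function unpacks. The sample payload is hand-written for illustration and shows only the keys this article relies on (real responses carry many more fields), and extract_albums is a hypothetical helper rather than part of the original code.

```python
from typing import Any, Dict, List

# Hand-written stand-in for the dictionary returned by sp.new_releases.
# Only the keys used in this article are included; the ids are placeholders.
sample_response: Dict[str, Any] = {
    "albums": {
        "items": [
            {"id": "album-1", "name": "Oitavo Céu", "artists": [{"name": "Dillaz"}]},
            {"id": "album-2", "name": "CASTANHO", "artists": [{"name": "T-Rex"}]},
        ],
        "limit": 50,
        "offset": 0,
        "total": 2,
    }
}


def extract_albums(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the album items from a new-releases style response (hypothetical helper)."""
    return response["albums"]["items"]


albums = extract_albums(sample_response)
print([album["name"] for album in albums])
```

The album list sits two levels deep, under "albums" and then "items", which is why the function indexes the response twice before returning.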

Retrieving Album Tracks

The get_album_tracks function accepts a single argument, album_id, which is the Spotify ID of the album for which we want to fetch track information. The function calls sp.album_tracks from the Spotify API, which returns a dictionary containing track data. The track items are then extracted and returned as a list of dictionaries.

def get_album_tracks(album_id: str) -> List[Dict[str, Any]]:
    """
    Fetch tracks from a specific album.

    Args:
        album_id (str): The Spotify ID of the album.

    Returns:
        List[Dict[str, Any]]: A list of dictionaries containing track information.
    """
    tracks = sp.album_tracks(album_id)["items"]
    return tracks
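
Note that get_album_tracks reads only the first page of results. As a hedged sketch, assuming sp behaves like a Spotipy client (whose next method follows the "next" link of a paged result), the remaining pages could be collected as shown below. FakeClient is a stand-in so the example runs without credentials, and get_all_album_tracks is a hypothetical name.

```python
from typing import Any, Dict, List, Optional


class FakeClient:
    """Stand-in for the Spotify client; serves two pages of tracks."""

    def __init__(self) -> None:
        self._pages = [
            {"items": [{"name": "Track 1"}, {"name": "Track 2"}], "next": "page-2"},
            {"items": [{"name": "Track 3"}], "next": None},
        ]

    def album_tracks(self, album_id: str) -> Dict[str, Any]:
        return self._pages[0]

    def next(self, result: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        # Mirrors the assumed client behaviour: return the following page, or None.
        return self._pages[1] if result["next"] else None


def get_all_album_tracks(client: Any, album_id: str) -> List[Dict[str, Any]]:
    """Collect tracks from every page of an album (hypothetical helper)."""
    page = client.album_tracks(album_id)
    tracks = list(page["items"])
    while page and page.get("next"):
        page = client.next(page)
        if page:
            tracks.extend(page["items"])
    return tracks


print(len(get_all_album_tracks(FakeClient(), "album-1")))
```

For typical albums a single page is enough, so the simpler function above is usually sufficient; the loop only matters for unusually long releases.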

Saving and Loading Data

The save_data_to_file function takes two arguments: data, which is a list of dictionaries containing album and track information, and file_path, which is the path to the JSON file where the data will be saved. The function writes the data to the specified file using the json.dump method.

Conversely, the load_data_from_file function reads the data from the specified JSON file and returns it as a list of dictionaries using the json.load method.

def save_data_to_file(data: List[Dict[str, Any]], file_path: str) -> None:
    """
    Save data to a JSON file.

    Args:
        data (List[Dict[str, Any]]): List of dictionaries containing album and track information.
        file_path (str): Path to the JSON file where the data will be saved.
    """
    with open(file_path, "w", encoding="utf-8") as file:
        json.dump(data, file, ensure_ascii=False, indent=4)

def load_data_from_file(file_path: str) -> List[Dict[str, Any]]:
    """
    Load data from a JSON file.

    Args:
        file_path (str): Path to the JSON file where the data is stored.

    Returns:
        List[Dict[str, Any]]: List of dictionaries containing album and track information.
    """
    with open(file_path, "r", encoding="utf-8") as file:
        return json.load(file)
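
A quick round trip shows how the two helpers fit together. This is a minimal sketch: the functions are repeated so the snippet is self-contained, the sample record is hand-written, and a temporary directory is used so nothing is left on disk.

```python
import json
import os
import tempfile
from typing import Any, Dict, List


def save_data_to_file(data: List[Dict[str, Any]], file_path: str) -> None:
    with open(file_path, "w", encoding="utf-8") as file:
        json.dump(data, file, ensure_ascii=False, indent=4)


def load_data_from_file(file_path: str) -> List[Dict[str, Any]]:
    with open(file_path, "r", encoding="utf-8") as file:
        return json.load(file)


# Hand-written sample record with an accented character in the album name.
sample = [{"album_name": "Oitavo Céu", "artist_name": "Dillaz", "tracks": []}]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "albums_and_tracks.json")
    save_data_to_file(sample, path)
    with open(path, encoding="utf-8") as file:
        raw_text = file.read()
    restored = load_data_from_file(path)

# With ensure_ascii=False the accented characters are written as-is,
# not as \u escape sequences, which keeps the file readable.
print("Céu" in raw_text)
print(restored == sample)
```

Writing with ensure_ascii=False also keeps the text that is later sent to the model shorter, since escaped characters take up several characters each.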

Downloading Latest Albums Data

The download_latest_albums_data function serves as the main driver for downloading the latest album and track data from Spotify. It initializes variables such as limit, offset, total_albums, album_count, and an empty list all_albums to store the fetched data.

The function then enters a loop that continues until the specified number of albums (total_albums) have been fetched. In each iteration, the function calls get_new_releases and get_album_tracks to retrieve the album and track information. This data is then stored in the all_albums list.

After fetching the data, the function increments the offset by the limit value to fetch the next set of albums in the subsequent iteration. A one-second delay is added to avoid hitting the Spotify API rate limit. The function finally calls save_data_to_file to store the fetched data in a JSON file.

def download_latest_albums_data() -> None:
    """
    Download the latest albums and tracks data from Spotify and save it to a JSON file.
    """
    limit = 50
    offset = 0
    total_albums = 30
    album_count = 0

    all_albums = []

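    while total_albums is None or album_count < total_albums:
        new_releases = get_new_releases(limit, offset)
        if total_albums is None:
            total_albums = sp.new_releases()["albums"]["total"]
        if not new_releases:
            break  # No more results; avoids looping forever

        for album in new_releases:
            if album_count >= total_albums:
                break  # Stop once the requested number of albums is reached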
            album_info = {
                "album_name": album["name"],
                "artist_name": album["artists"][0]["name"],
                "album_type": album["album_type"],
                "release_date": album["release_date"],
                "available_markets": album["available_markets"],
                "tracks": [],
            }

            tracks = get_album_tracks(album["id"])

            for track in tracks:
                track_info = {
                    "track_name": track["name"],
                    "duration_ms": track["duration_ms"],
                    "preview_url": track["preview_url"],
                }
                album_info["tracks"].append(track_info)

            all_albums.append(album_info)
            album_count += 1

        offset += limit
        time.sleep(1)  # Add a delay to avoid hitting the rate limit
        print(f"Downloaded {album_count}/{total_albums}")

    save_data_to_file(all_albums, "albums_and_tracks.json")
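
The paging logic is easier to follow in isolation. The sketch below is not the author's code: collect_albums and fake_fetch are hypothetical names, and fake_fetch stands in for get_new_releases so the example runs offline. It requests no more than the number of albums still needed and stops early if a page comes back empty.

```python
from typing import Any, Callable, Dict, List

Album = Dict[str, Any]


def collect_albums(
    fetch_page: Callable[[int, int], List[Album]],
    total_albums: int,
    page_size: int = 50,
) -> List[Album]:
    """Fetch up to total_albums items, one page at a time."""
    collected: List[Album] = []
    offset = 0
    while len(collected) < total_albums:
        # Ask only for what is still missing, capped at the page size.
        limit = min(page_size, total_albums - len(collected))
        page = fetch_page(limit, offset)
        if not page:
            break  # The catalogue has no more results.
        collected.extend(page)
        offset += len(page)
    return collected


# Offline stand-in for get_new_releases: a catalogue of 120 numbered albums.
CATALOGUE = [{"name": f"Album {i}"} for i in range(120)]


def fake_fetch(limit: int, offset: int) -> List[Album]:
    return CATALOGUE[offset : offset + limit]


print(len(collect_albums(fake_fetch, total_albums=30)))
print(len(collect_albums(fake_fetch, total_albums=500)))
```

Against the real API, a short pause between pages (as in the function above) would still be sensible to stay within the rate limit.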

By using these functions, we can effectively access the Spotify API to gather data about the latest music releases. In the next section, we will explore how to preprocess this data and use ChatGPT to generate a summary of these new releases.

Summarizing with ChatGPT using LangChain

In this section, we will discuss how to preprocess the album and track data obtained from the Spotify API and use ChatGPT to generate a summary of the latest music releases with the help of the LangChain library. LangChain is a powerful tool that enables developers to build applications that combine LLMs with other sources of computation or knowledge.

We will use the following Python functions to achieve this:

preprocess_docs: Convert the JSON data to a list of Document objects.
get_summary: Generate a summary using the JSON data provided in the list of Document objects.

Preprocessing the Data

The preprocess_docs function accepts a list of dictionaries containing album and track information, which is the data we retrieved from the Spotify API. The function converts this data into a JSON string and then splits it into 3500-character segments. These segments are used to create a list of Document objects, which will be passed to ChatGPT for summary generation.

The reason for splitting the data into smaller segments is to handle the text length limitations imposed by the ChatGPT API. By breaking the text into smaller pieces, we can process the data more efficiently without exceeding the model's maximum token limit.

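# Import paths below are assumed from the LangChain releases of early 2023;
# newer versions may expose these classes under different modules.
from langchain.chains.summarize import load_summarize_chain
from langchain.chat_models import ChatOpenAI
from langchain.docstore.document import Document
from langchain.prompts import PromptTemplate


def preprocess_docs(data: List[Dict[str, Any]]) -> List[Document]: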
    """
    Convert the JSON data to a list of Document objects.

    Args:
        data (List[Dict[str, Any]]): List of dictionaries containing album and track information.

    Returns:
        List[Document]: A list of Document objects containing the JSON data as strings, split into 3500-character segments.
    """
    json_string = json.dumps(data, ensure_ascii=False, indent=4)
    doc_splits = [json_string[i : i + 3500] for i in range(0, len(json_string), 3500)]
    docs = [Document(page_content=split_text) for split_text in doc_splits]
    return docs
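
One side effect of slicing the JSON string at fixed character offsets is that a chunk can begin or end in the middle of an album record. As an alternative sketch (not what the article's code does), the records can be grouped so that each chunk is itself valid JSON and stays under a character budget where possible. chunk_albums and max_chars are hypothetical names; the function returns plain strings, which could then be wrapped in Document objects in the same way as above.

```python
import json
from typing import Any, Dict, List


def chunk_albums(data: List[Dict[str, Any]], max_chars: int = 3500) -> List[str]:
    """Group album records into JSON strings of at most max_chars characters.

    A single album larger than max_chars still gets a chunk of its own.
    """
    chunks: List[str] = []
    current: List[Dict[str, Any]] = []
    for album in data:
        candidate = json.dumps(current + [album], ensure_ascii=False)
        if current and len(candidate) > max_chars:
            # Adding this album would overflow the budget: close the chunk.
            chunks.append(json.dumps(current, ensure_ascii=False))
            current = [album]
        else:
            current.append(album)
    if current:
        chunks.append(json.dumps(current, ensure_ascii=False))
    return chunks


# Synthetic records, used only to exercise the function.
albums = [{"album_name": f"Album {i}", "tracks": ["x" * 40] * 5} for i in range(20)]
chunks = chunk_albums(albums, max_chars=1000)
print(len(chunks), max(len(chunk) for chunk in chunks))
```

The trade-off is that chunks vary in size, whereas fixed-width slicing gives a predictable upper bound on every chunk with less code.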

Generating a Summary with ChatGPT using LangChain

LangChain's CombineDocuments chains are designed to process and combine information from multiple documents, making them ideal for tasks like summarization and question answering. In our case, we'll focus on the Map Reduce method to generate a summary of the latest Spotify releases using ChatGPT. You can easily use GPT-4 if you already have access to the API. For that, you just need to update the model_name argument passed to the ChatOpenAI class.

The Map Reduce method works by running an initial prompt on each chunk of data, generating an output for each. For instance, in a summarization task, this would involve creating a summary for each individual chunk. In the next step, a different prompt is run to combine all these initial outputs into a single, coherent output.

The main advantages of the Map Reduce method are that it can scale to larger documents and handle more documents than the Stuffing method, which places all documents into a single prompt and is therefore bounded by the model's context length. Additionally, the calls to the LLM for individual documents are independent, allowing for parallelization and faster processing.

In the context of our project, each Document produced by preprocess_docs is summarized on its own in the map step, and the resulting partial summaries are then combined into a single, concise summary in the reduce step.
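
The control flow of Map Reduce can be sketched in a few lines of plain Python. This is an illustration only: map_reduce_summarize is a hypothetical name, and toy_map and toy_reduce are simple stand-ins for the two LLM prompts so that the example runs offline.

```python
from typing import Callable, List


def map_reduce_summarize(
    chunks: List[str],
    map_step: Callable[[str], str],
    reduce_step: Callable[[str], str],
) -> str:
    """Summarize each chunk independently, then summarize the joined summaries."""
    partial_summaries = [map_step(chunk) for chunk in chunks]  # map: one call per chunk
    combined = "\n".join(partial_summaries)
    return reduce_step(combined)  # reduce: a single call over all partial outputs


# Toy stand-ins for the two LLM prompts, so the control flow can run offline.
def toy_map(chunk: str) -> str:
    return chunk.split(".")[0].strip()  # keep only the first sentence


def toy_reduce(text: str) -> str:
    return " | ".join(text.splitlines())


chunks = ["Album A has 12 tracks. More detail.", "Album B has 11 tracks. More detail."]
print(map_reduce_summarize(chunks, toy_map, toy_reduce))
```

Because each map call depends only on its own chunk, the list comprehension could be replaced by parallel calls without changing the result.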

def get_summary(docs: List[Document]) -> Dict[str, Any]:
    """
    Generate a summary using the JSON data provided in the list of Document objects.

    Args:
        docs (List[Document]): A list of Document objects containing the JSON data as strings.

    Returns:
        Dict[str, Any]: The chain output, holding the intermediate per-chunk summaries and the final summary.
    """
    llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")

    prompt_template = """Write a short summary about the latest songs in Spotify based on the JSON data below:\n\n{text}."""
    prompt_template2 = """Write an article about the latest music released in Spotify (below) and address the change in music trends using the style of Rick Beato:\n\n{text}"""

    PROMPT = PromptTemplate(template=prompt_template, input_variables=["text"])
    PROMPT2 = PromptTemplate(template=prompt_template2, input_variables=["text"])

    chain = load_summarize_chain(
        llm,
        chain_type="map_reduce",
        return_intermediate_steps=True,
        map_prompt=PROMPT,
        combine_prompt=PROMPT2,
        verbose=True,
    )

    res = chain({"input_documents": docs}, return_only_outputs=True)

    return res
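
Because return_intermediate_steps=True is set, the value returned by the chain is a dictionary rather than a plain string. The sketch below shows one way to separate the per-chunk summaries from the final text. The key names "intermediate_steps" and "output_text" are assumptions that may differ between LangChain releases, split_result is a hypothetical helper, and the example dictionary is hand-written rather than real model output.

```python
from typing import Any, Dict, List, Tuple


def split_result(
    res: Dict[str, Any],
    steps_key: str = "intermediate_steps",  # assumed key name; check your LangChain version
    output_key: str = "output_text",  # assumed key name
) -> Tuple[List[str], str]:
    """Separate the per-chunk summaries from the final combined text."""
    steps = list(res.get(steps_key, []))
    final_text = str(res.get(output_key, ""))
    return steps, final_text


# Hand-written example shaped like the assumed chain output.
example_res = {
    "intermediate_steps": ["Summary of chunk 1", "Summary of chunk 2"],
    "output_text": "Combined article text",
}

steps, final_text = split_result(example_res)
print(len(steps), final_text)
```

Printing the keys of the real result once is a quick way to confirm the names before relying on them.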

Results

To illustrate what the system produces, we will walk through one example run: the input data fetched from the Spotify API, the intermediate per-chunk summaries, and the final output generated by ChatGPT.

Input Data

The input data consists of several albums and their corresponding tracks, such as "Oitavo Céu" by Dillaz and "CASTANHO" by T-Rex. Each album includes details like the album name, artist name, album type, release date, and a list of tracks with their names and durations in milliseconds.

Intermediate Steps

The intermediate steps involve processing the input data using the Map Reduce method. For instance, the following summary was generated for one part of the input data:

The latest songs in Spotify include tracks from three new albums: "Oitavo Céu" by Dillaz, "CASTANHO" by T-Rex, and "OBG" by Branko. "Oitavo Céu" features 12 tracks, including the title track and "Maçã" which has the longest duration at 219130 ms. "CASTANHO" has 11 tracks, with "LADO NENHUM" having the longest duration at 278190 ms. "OBG" has 10 tracks, with "ETA" having the longest duration at 226058 ms. All three albums were released in April 2022.
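
The durations in this intermediate summary are raw millisecond values copied from the JSON. As a small aside, a hypothetical helper such as format_duration (not part of the original code) could convert them to a minutes:seconds form before the data is sent to the model:

```python
def format_duration(duration_ms: int) -> str:
    """Convert a duration in milliseconds to minutes:seconds, dropping the remainder."""
    total_seconds = duration_ms // 1000
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"


# Durations quoted in the intermediate summary above.
longest_tracks = {"Maçã": 219130, "LADO NENHUM": 278190, "ETA": 226058}
for name, duration_ms in longest_tracks.items():
    print(name, format_duration(duration_ms))
```

For the three tracks above this gives 3:39, 4:38, and 3:46 respectively.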

Final Output

The final output combines the summaries from the intermediate steps and provides a coherent, concise overview of the latest releases on Spotify:

In recent years, the music industry has seen a significant shift in trends, with the rise of streaming platforms like Spotify and the increasing popularity of genres like hip-hop and electronic dance music (EDM). As a result, the latest music released on Spotify reflects these changes, with a diverse range of artists and genres represented.

One notable trend is the increasing prominence of Portuguese-language music, with albums like "Oitavo Céu" by Dillaz and "CASTANHO" by T-Rex featuring prominently on the platform. These albums showcase the unique sounds and rhythms of Portuguese music, blending traditional styles with modern influences.

Another trend is the growing popularity of collaborations between artists from different genres and backgrounds. Albums like "Life of a DON" by Don Toliver and "Faith" by Pop Smoke feature collaborations with a range of artists, from Travis Scott and Kanye West to Rick Ross and Lil Tjay. These collaborations allow artists to explore new sounds and styles, and create music that appeals to a wider audience.

In addition, the latest music on Spotify reflects the increasing diversity of musical styles and influences. Albums like "SOUR" by Olivia Rodrigo and "Happier Than Ever" by Billie Eilish showcase the unique voices and perspectives of young female artists, offering a fresh take on contemporary music.

As demonstrated, our ChatGPT-based system effectively summarizes the latest Spotify releases, providing an accessible and engaging overview for music enthusiasts to stay informed and discover new content.

Conclusions

In this article, we demonstrated how to combine the Spotify API with OpenAI's ChatGPT to create a summarization system that keeps you up to date with the latest music releases. We discussed document chaining techniques, selected the Map Reduce method for its scalability, and showed the system producing coherent summaries from the downloaded data.

The synergy between AI-powered language models and APIs from popular platforms like Spotify opens up new opportunities for innovation and personalization. As AI technologies continue to evolve, their applications in various NLP tasks will only expand, offering exciting ways to enhance our daily lives.

In conclusion, our exploration serves as an example of the potential of cutting-edge AI technologies in solving real-world challenges and creating valuable experiences for users. We hope this article encourages you to explore further applications of AI in your own projects and inspires you to create innovative solutions that make a difference.
