Recommendation Systems
In AI recommendation systems, machine learning models are widely used to recommend similar items, whether products, content, or contacts. Most of these pre-trained models are open-source and can be used without training a model from scratch. However, without large volumes of customer data, there is no open-source technology we can rely on to recommend complementary products.
In this article, I propose a framework (with code in the form of a user-friendly library) that exploits an LLM to discover complementary products inexpensively.
My goal is for this framework to be:
- Scalable
It is a framework that should not require supervision when running, that does not risk breaking, and whose output should be easily structured so it can be used in combination with additional tools.
- Affordable
It should be affordable to find the complements of thousands of products with minimum spending (approx. 1 USD per 1000 computed products, using groq pricing) and without requiring any fine-tuning (this means that it could even be tested on a single product).
***The full zeroCPR code is open-source and available in my GitHub repo; feel free to contact me for support or feature requests. In this article, I introduce both the zeroCPR framework (and its library) and a new prompting technique that I call Chain-of-DataFrame for list reasoning.
Before digging into the theory of the zeroCPR framework, let us understand why current technology is limited in this very domain:
Why do neural networks excel at recommending similar products?
These models excel at this task because neural networks innately group samples with common features in the same region of the embedding space. For example, a neural network trained on human language will place words or sentences with similar meanings in the same region. Following the same principle, a network trained on customer behavior will place customers with similar behavior close together.
The models capable of recommending similar sentences are called semantic models, and they are both light and accessible, allowing the creation of recommendation systems that rely on language similarity rather than customer behavior.
A retail company that lacks customer data can easily recommend similar products by exploiting the capabilities of a semantic model.
What about complementary products?
However, recommending complementary products is a totally different task. To my knowledge, no open-source model is capable of performing it. Retail companies train their custom complementary recommender systems on their own data, resulting in models that are industry-specific and difficult to generalize.
zeroCPR stands for zero-shot complementary product recommender. The way it works is simple: given a list of your available products and a set of reference products, it tries to find complementary products in your list that can be recommended.
Large Language Models can easily recommend complementary products. You can ask ChatGPT to output what products can be paired with a toothbrush, and it will likely recommend dental floss and toothpaste.
However, my goal is to create an enterprise-grade tool that can work with our custom data. ChatGPT may be correct, but it is generating an unstructured output that cannot be integrated with our list of products.
The zeroCPR framework can be outlined as three steps, applied to each product in our product list:
1. List complementary products
As explained, the first bottleneck to solve is finding actual complementary products. Because similarity models are out of the question, we need to use an LLM. The execution of the first step is quite simple: given an input product (e.g. Coca-Cola), produce a list of valid complementary products a user may purchase with it.
I asked the LLM to output a list that can be parsed directly in Python; once parsed, we can visualize the output.
The results are not bad at all: these are all products that are likely to be purchased in pairs with Coca-Cola. There is, however, a small issue: THESE PRODUCTS MAY NOT BE IN OUR DATA.
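To make the parsing in this step concrete, here is a minimal sketch. It assumes the LLM answers with a Python-style list of strings, possibly wrapped in stray text. The function name `parse_product_list` and the sample response are hypothetical and are not taken from the zeroCPR library.

```python
import ast


def parse_product_list(raw_response: str) -> list[str]:
    """Parse an LLM response expected to contain a Python list of product names.

    Raises ValueError when no flat list of strings can be recovered,
    so the caller can retry the request instead of failing later.
    """
    text = raw_response.strip()
    # Keep only the outermost [...] in case the model adds text around the list.
    start, end = text.find("["), text.rfind("]")
    if start == -1 or end <= start:
        raise ValueError("no list found in LLM response")
    try:
        # literal_eval only evaluates literals, so it never executes code.
        parsed = ast.literal_eval(text[start : end + 1])
    except (SyntaxError, ValueError) as exc:
        raise ValueError("LLM response is not a valid Python list") from exc
    if not isinstance(parsed, list) or not all(isinstance(p, str) for p in parsed):
        raise ValueError("expected a flat list of strings")
    # Drop empty entries and surrounding whitespace.
    return [p.strip() for p in parsed if p.strip()]


# Hypothetical raw response for the input product "Coca-Cola".
sample_response = 'Sure! ["Nachos", "Burgers", "Napkins", "Pizza"]'
complementary_products = parse_product_list(sample_response)
print(complementary_products)
```

Raising a single exception type on any malformed answer keeps the retry logic simple, which matters if the framework is meant to run unsupervised.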
2. Matching the available products in our data
The next step is trying to match every complementary product suggested by the LLM with a corresponding product in our dataset. For example, we want to match “Nachos” with the closest possible product in our dataset.
We can perform this matching using vector search. For each LLM product, we will match it with the most semantically similar in our dataset.
As we can see, the results are far from accurate. “Nachos” will be matched with “SET OF SALT AND PEPPER TOADSTOOLS”, while the closest match with “Burgers” is “S/2 BEACH HUT STOOLS”. Some of the matches are valid (we can look at Napkins), but when the dataset contains no valid match, a semantic search will still pair the suggestion with an irrelevant candidate. Using a cosine similarity threshold is, in my experience, a terrible method for selecting valid choices. Instead, I will use an LLM again to validate the data.
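The matching step can be sketched as a nearest-neighbor search under cosine similarity. The example below is only an illustration: `toy_embed` counts character trigrams and is a stand-in for a real semantic model, and the napkins entry in the catalog is a hypothetical product name. The point it shows is the one made above: the search always returns a best candidate, even when nothing in the catalog is relevant.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a semantic model: counts of character trigrams.

    A real setup would call a sentence-embedding model here instead.
    """
    padded = f"  {text.lower()} "
    return Counter(padded[i : i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_product(query: str, catalog: list[str]) -> tuple[str, float]:
    """Return the catalog item most similar to the query, with its score."""
    query_vec = toy_embed(query)
    scored = [(item, cosine(query_vec, toy_embed(item))) for item in catalog]
    return max(scored, key=lambda pair: pair[1])


catalog = [
    "SET OF SALT AND PEPPER TOADSTOOLS",
    "S/2 BEACH HUT STOOLS",
    "PACK OF 20 NAPKINS",  # hypothetical catalog entry
]
for suggestion in ["Napkins", "Nachos"]:
    match, score = nearest_product(suggestion, catalog)
    print(f"{suggestion!r} -> {match!r} (cosine={score:.2f})")
```

Because `max` always picks something, a suggestion with no real counterpart still gets a match, which is why a separate validation step is needed.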
3. Select correct complements using Chain-of-DataFrame
The goal is now to validate the matching of the previous step. My first attempts to match the products recommended by an LLM were frustrated by the lack of coherence in the output. Even with a 70B model, when I passed a list of products to match in the prompt, the output was less than desirable (with combinations of formatting errors and highly unrealistic output).
However, I have noticed that when I input a list of products and ask the model to reason on each sample and output a score (0 or 1), following the format of a pandas dataframe and applying a transformation to a single column, the model is much more reliable (in terms of both format and output). I call this prompting paradigm Chain-of-DataFrame, in reference to the well-known pandas data structure.
To give you an idea of Chain-of-DataFrame prompting, here is the prompt. In the following example, {product_name} is Coca-Cola, while {complementary_list} is the column called recommended_product we can see in the image below:
A customer is doing shopping and buys the following product
product_name: {product_name}
A junior shopping expert recommends the following products to be bought together, however he still has to learn:
given the following
complementary_list: {complementary_list}
Output a parsable python list using python, no comments or extra text, in the following format:
[
[<product_name 1>, <reason why it is complementary or not>, <0 or 1>],
[<product_name 2>, <reason why it is complementary or not>, <0 or 1>],
[<product_name 3>, <reason why it is complementary or not>, <0 or 1>],
...
]
the customer is only interested in **products that can be paired with the existing one** to enrich his experience, not substitutes
THE ORDER OF THE OUTPUT MUST EQUAL THE ORDER OF ELEMENTS IN complementary_list
Take it easy, take a big breath to relax and be accurate. Output must start with [, end with ], no extra text
The output is a multidimensional list that can be parsed easily and immediately converted again into a pandas dataframe.
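As a rough sketch of that parsing step, the snippet below turns the model's nested list into one record per row and keeps only the rows scored 1. It assumes the model respected the requested order, so rows are aligned with `complementary_list` by position. The function name `parse_scored_rows` and the sample response are hypothetical; the column names mirror the ones mentioned in this article.

```python
import ast


def parse_scored_rows(raw_response: str, complementary_list: list[str]) -> list[dict]:
    """Convert a [[name, reason, score], ...] response into a list of records.

    Rows are aligned with complementary_list by position, as the prompt requires.
    Raises ValueError on any format problem so the caller can retry.
    """
    try:
        rows = ast.literal_eval(raw_response.strip())
    except (SyntaxError, ValueError) as exc:
        raise ValueError("response is not a valid Python literal") from exc
    if not isinstance(rows, list) or len(rows) != len(complementary_list):
        raise ValueError("row count differs from complementary_list")
    records = []
    for expected_name, row in zip(complementary_list, rows):
        if not isinstance(row, (list, tuple)) or len(row) != 3:
            raise ValueError(f"malformed row: {row!r}")
        _, reasoning, score = row
        if score not in (0, 1):
            raise ValueError(f"score must be 0 or 1, got {score!r}")
        records.append(
            {
                "recommended_product": expected_name,
                "reasoning": str(reasoning),
                "score": int(score),
            }
        )
    return records


candidates = ["PACK OF 20 NAPKINS", "S/2 BEACH HUT STOOLS"]
# Hypothetical model output for product_name = "Coca-Cola".
sample_response = """[
    ["PACK OF 20 NAPKINS", "useful when having snacks with a drink", 1],
    ["S/2 BEACH HUT STOOLS", "furniture, unrelated to a soft drink", 0],
]"""
records = parse_scored_rows(sample_response, candidates)
kept = [r["recommended_product"] for r in records if r["score"] == 1]
print(kept)
```

Since each record is a plain dict, `pandas.DataFrame(records)` would rebuild the dataframe with the `recommended_product`, `reasoning`, and `score` columns.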
Notice the reasoning and score columns generated by the model to find the best complementary products. With this last step, we have been able to filter out most of the irrelevant matches.
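Putting the three steps together, the per-product flow could look like the sketch below. This is not the zeroCPR library API: `recommend_complements` and the three callables it receives are hypothetical names, and the stub functions return hard-coded answers in place of real LLM and vector search calls.

```python
from typing import Callable


def recommend_complements(
    product: str,
    list_complements: Callable[[str], list[str]],
    match_in_catalog: Callable[[str], str],
    validate: Callable[[str, list[str]], list[int]],
) -> list[str]:
    """Run the three steps for one product and return the validated matches."""
    # Step 1: ask the LLM for complementary products.
    suggestions = list_complements(product)
    # Step 2: map each suggestion to its nearest catalog item.
    # dict.fromkeys removes duplicate matches while keeping their order.
    candidates = list(dict.fromkeys(match_in_catalog(s) for s in suggestions))
    if not candidates:
        return []
    # Step 3: let the LLM score each candidate with 0 or 1.
    scores = validate(product, candidates)
    if len(scores) != len(candidates):
        raise ValueError("validator must return one score per candidate")
    return [c for c, s in zip(candidates, scores) if s == 1]


# Stub steps with hard-coded answers, for illustration only.
def fake_list_complements(product: str) -> list[str]:
    return ["Nachos", "Napkins"]


fake_matches = {
    "Nachos": "SET OF SALT AND PEPPER TOADSTOOLS",
    "Napkins": "PACK OF 20 NAPKINS",  # hypothetical catalog entry
}


def fake_validate(product: str, candidates: list[str]) -> list[int]:
    return [1 if "NAPKINS" in c else 0 for c in candidates]


result = recommend_complements(
    "Coca-Cola", fake_list_complements, fake_matches.__getitem__, fake_validate
)
print(result)
```

Passing the steps in as callables keeps the loop independent of any particular LLM provider or embedding model, and makes each step easy to test in isolation.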
***The algorithm may look similar to CHAIN-OF-TABLE: EVOLVING TABLES IN THE REASONING CHAIN FOR TABLE UNDERSTANDING, but I deem the one proposed above is much simpler and uses a different structure. Feel free to comment if you think otherwise.
4. Dealing with little data: Nearest Substitute Filling
There is one last issue we need to address. It is likely that, due to the lack of data, the number of recommended products is minimal. In the example above, we can recommend 6 complementary products, but there might be cases where we can only recommend 2 or 3. How can we improve the user experience, and expand the number of valid recommendations, given the limitations imposed by our data?