I built an AI agent that scrapes leads for a given niche directly from Google

    Shared 11/14/2025

    JSON Code

    {
      "meta": {
        "instanceId": "e1b05efeff19931360256028fd81f79b3b01be411ca4b93ac00d58fc5cc06184",
        "templateCredsSetupCompleted": true
      },
      "nodes": [
        {
          "id": "b8820690-4362-4f93-a570-adbfe256f138",
          "name": "Loop Over Items",
          "type": "n8n-nodes-base.splitInBatches",
          "position": [
            416,
            -208
          ],
          "parameters": {
            "options": {
              "reset": false
            }
          },
          "typeVersion": 3
        },
        {
          "id": "8c58ada5-cd57-4d8c-8b91-c2a832ac5fff",
          "name": "gather data",
          "type": "n8n-nodes-base.code",
          "position": [
            640,
            -336
          ],
          "parameters": {
            "jsCode": "// Collect the title and snippet of every search result across all pages\nconst results = [];\n\nfor (const item of $input.all()) {\n  // A page with no results has no `items` array\n  for (const { title, snippet } of item.json.items ?? []) {\n    results.push({ title, snippet });\n  }\n}\n\nreturn [{ json: { results } }];\n"
          },
          "typeVersion": 2
        },
        {
          "id": "33af828f-9f5e-4b2a-ba06-e3974d682b8d",
          "name": "Basic LLM Chain",
          "type": "@n8n/n8n-nodes-langchain.chainLlm",
          "position": [
            960,
            -336
          ],
          "parameters": {
            "text": "=**Role:**\nYou are a **data extraction and transformation expert**. Your task is to analyze JSON data that may contain scattered or nested information about individuals or businesses, and extract **unique** entries containing the available contact details.\n\n**Goal:**\nRead the provided JSON data and output an array of objects containing the following fields (leave a field empty if its value is not available):\n\n* `name`\n* `email`\n* `phone`\n* `address`\n\nEach extracted entry should be unique and formatted consistently. If `name`, `phone`, or `address` is not available, keep that field empty. If `email` is not available, discard the entry.\n\n---\n\n**Input:**\n\n```\n{{ JSON.stringify($json.results) }}\n```\n\nThis data may include fields like `contact`, `email`, `phone`, `address`, or may embed this info within text or nested structures.\n\n---\n\n**Output Format:**\n\n```json\n[\n  {\n    \"name\": \"First Last\",\n    \"email\": \"abc@example.com\",\n    \"phone\": \"1234567890\",\n    \"address\": \"123 Example Street, City, State\"\n  }\n]\n```\n\n* Each object represents **one unique contact**.\n* Deduplicate entries using `email` or `phone` as primary identifiers.\n* Preserve text formatting (don’t alter case or add punctuation not in source).\n\n---\n\n**Instructions:**\n\n1. Parse the input JSON carefully, handling nested objects and arrays.\n2. Search for any fields or text patterns that represent **names, emails, phone numbers, or addresses**.\n3. Normalize and clean values (trim spaces, remove duplicates).\n4. Skip records that have no email or are otherwise invalid.\n5. Output the final cleaned list of unique contacts in the JSON format exactly as shown above.",
            "batching": {},
            "promptType": "define",
            "hasOutputParser": true
          },
          "typeVersion": 1.7
        },
        {
          "id": "ce20ec1a-6e71-42c8-a75a-671feca0cb2c",
          "name": "OpenAI Chat Model",
          "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
          "position": [
            960,
            64
          ],
          "parameters": {
            "model": {
              "__rl": true,
              "mode": "list",
              "value": "gpt-4.1-mini"
            },
            "options": {}
          },
          "credentials": {
            "openAiApi": {
              "id": "ZerNZoHwDCmguUeo",
              "name": "OpenAi account"
            }
          },
          "typeVersion": 1.2
        },
        {
          "id": "3583443b-4c26-41ba-9d90-6900165d62e4",
          "name": "Structured Output Parser",
          "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
          "position": [
            1024,
            -112
          ],
          "parameters": {
            "autoFix": true,
            "jsonSchemaExample": "[\n  {\"name\": \"John Doe\", \"email\": \"John@example.com\", \"phone\": \"+1 555-1234\"},\n  {\"name\": \"Jane\", \"email\": \"Jane@example.com\", \"address\": \"123 Main St, NY\"},\n  {\"name\": \"Sam\", \"email\": \"sam@mail.com\", \"phone\": \"555-9876\"}\n]"
          },
          "typeVersion": 1.3
        },
        {
          "id": "9aa90dec-ce86-43c0-8044-fbf6126e7f9e",
          "name": "Append row in sheet",
          "type": "n8n-nodes-base.googleSheets",
          "position": [
            1824,
            -320
          ],
          "parameters": {
            "columns": {
              "value": {
                "name": "={{ $json.name }}",
                "email": "={{ $json.email }}",
                "niche": "={{ $('Input').first().json.niche }}",
                "phone": "={{ $json.phone }}",
                "address": "={{ $json.address }}"
              },
              "schema": [
                {
                  "id": "name",
                  "type": "string",
                  "display": true,
                  "required": false,
                  "displayName": "name",
                  "defaultMatch": false,
                  "canBeUsedToMatch": true
                },
                {
                  "id": "email",
                  "type": "string",
                  "display": true,
                  "required": false,
                  "displayName": "email",
                  "defaultMatch": false,
                  "canBeUsedToMatch": true
                },
                {
                  "id": "phone",
                  "type": "string",
                  "display": true,
                  "required": false,
                  "displayName": "phone",
                  "defaultMatch": false,
                  "canBeUsedToMatch": true
                },
                {
                  "id": "address",
                  "type": "string",
                  "display": true,
                  "required": false,
                  "displayName": "address",
                  "defaultMatch": false,
                  "canBeUsedToMatch": true
                },
                {
                  "id": "niche",
                  "type": "string",
                  "display": true,
                  "removed": false,
                  "required": false,
                  "displayName": "niche",
                  "defaultMatch": false,
                  "canBeUsedToMatch": true
                }
              ],
              "mappingMode": "defineBelow",
              "matchingColumns": [],
              "attemptToConvertTypes": false,
              "convertFieldsToString": false
            },
            "options": {},
            "operation": "append",
            "sheetName": {
              "__rl": true,
              "mode": "list",
              "value": "gid=0",
              "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1HNW1qabyaSd6S5AbLMyb1_6jt57Mp6SPY2qVIhi49JQ/edit#gid=0",
              "cachedResultName": "Sheet1"
            },
            "documentId": {
              "__rl": true,
              "mode": "list",
              "value": "1HNW1qabyaSd6S5AbLMyb1_6jt57Mp6SPY2qVIhi49JQ",
              "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1HNW1qabyaSd6S5AbLMyb1_6jt57Mp6SPY2qVIhi49JQ/edit?usp=drivesdk",
              "cachedResultName": "leads"
            }
          },
          "credentials": {
            "googleSheetsOAuth2Api": {
              "id": "merA7vHlm9MkcRB7",
              "name": "Google Sheets account 2"
            }
          },
          "typeVersion": 4.7
        },
        {
          "id": "0db8a0a5-0fce-4c6f-863b-cbf42ef935bc",
          "name": "Loop Over Items1",
          "type": "n8n-nodes-base.splitInBatches",
          "position": [
            1552,
            -336
          ],
          "parameters": {
            "options": {}
          },
          "typeVersion": 3
        },
        {
          "id": "ea61af11-a8d7-48f7-8c94-78d79fc1a5a7",
          "name": "Split Out",
          "type": "n8n-nodes-base.splitOut",
          "position": [
            1312,
            -336
          ],
          "parameters": {
            "options": {},
            "fieldToSplitOut": "output"
          },
          "typeVersion": 1
        },
        {
          "id": "879d6c01-0b82-44b5-925e-1303213900f2",
          "name": "Input",
          "type": "n8n-nodes-base.set",
          "position": [
            -32,
            -192
          ],
          "parameters": {
            "mode": "raw",
            "options": {},
            "jsonOutput": "{\n  \"niche\": \"dentist in LA\",\n  \"pages\": 2\n}\n"
          },
          "typeVersion": 3.4
        },
        {
          "id": "36403b59-17ac-4c03-8601-c4e7d422e1e7",
          "name": "Set Pages",
          "type": "n8n-nodes-base.code",
          "position": [
            192,
            -192
          ],
          "parameters": {
            "jsCode": "const pages = $input.first().json.pages;\n\nconst startIndexes = [];\n\nfor (let i = 0; i < pages; i++) {\n  startIndexes.push((10 * i) + 1);\n}\n\n// Return array of items with json property for n8n loop node\nreturn startIndexes.map(startIndex => ({\n  json: {\n    startIndex: startIndex,\n  }\n}));"
          },
          "typeVersion": 2
        },
        {
          "id": "01bd5c7e-4e01-492a-a542-cc473aacfe7a",
          "name": "Google Search",
          "type": "n8n-nodes-base.httpRequest",
          "position": [
            640,
            -144
          ],
          "parameters": {
            "url": "https://www.googleapis.com/customsearch/v1",
            "options": {},
            "sendQuery": true,
            "queryParameters": {
              "parameters": [
                {
                  "name": "q",
                  "value": "={{ $('Input').first().json.niche }}"
                },
                {
                  "name": "lr",
                  "value": "lang_en"
                },
                {
                  "name": "cr",
                  "value": "countryUS"
                },
                {
                  "name": "dateRestrict",
                  "value": "y1"
                },
                {
                  "name": "key",
                  "value": "<YOUR_API_KEY>"
                },
                {
                  "name": "cx",
                  "value": "<YOUR_CUSTOM_SEARCH_ID>"
                },
                {
                  "name": "start",
                  "value": "={{ $json.startIndex }}"
                },
                {
                  "name": "exactTerms",
                  "value": "@gmail.com"
                }
              ]
            }
          },
          "typeVersion": 4.2
        },
        {
          "id": "53eaaeea-8c87-4ff3-8cb9-0bc804a116c6",
          "name": "Sticky Note",
          "type": "n8n-nodes-base.stickyNote",
          "position": [
            -352,
            -528
          ],
          "parameters": {
            "color": 7,
            "width": 656,
            "height": 704,
            "content": "## Setup Context\n\nSet the niche and the number of pages to search on Google."
          },
          "typeVersion": 1
        },
        {
          "id": "a4d3ebb1-1b7b-4fb8-b30c-ffa5c4082eb2",
          "name": "Sticky Note1",
          "type": "n8n-nodes-base.stickyNote",
          "position": [
            336,
            -608
          ],
          "parameters": {
            "color": 3,
            "width": 512,
            "height": 784,
            "content": "## Google Search\n\nSearch Google and gather data from multiple result pages.\n\n* Replace these query parameters with your own values:\n\n1. key - <YOUR_API_KEY>\n2. cx - <YOUR_CUSTOM_SEARCH_ID>\n\nWatch this video to learn how to set up these values:\n> https://youtu.be/9f0PzEiaIgM?si=_tZunyouzVDTEFkc&t=262\n"
          },
          "typeVersion": 1
        },
        {
          "id": "814c31fd-bf1d-4be1-9405-1662c8b8fe87",
          "name": "Sticky Note2",
          "type": "n8n-nodes-base.stickyNote",
          "position": [
            880,
            -528
          ],
          "parameters": {
            "color": 5,
            "width": 576,
            "height": 704,
            "content": "## Parse the gathered data into a readable format\n\nParse the incoming data from Google into contact records that are ready to be added to a Google Sheet."
          },
          "typeVersion": 1
        },
        {
          "id": "b30ae33a-8c48-41f8-b6f8-4506ab2a6ced",
          "name": "Sticky Note3",
          "type": "n8n-nodes-base.stickyNote",
          "position": [
            1488,
            -528
          ],
          "parameters": {
            "color": 4,
            "width": 512,
            "height": 704,
            "content": "## Add Leads to the sheet\n\nAdd the name, email, and other available data to a Google Sheet."
          },
          "typeVersion": 1
        }
      ],
      "pinData": {},
      "connections": {
        "Input": {
          "main": [
            [
              {
                "node": "Set Pages",
                "type": "main",
                "index": 0
              }
            ]
          ]
        },
        "Set Pages": {
          "main": [
            [
              {
                "node": "Loop Over Items",
                "type": "main",
                "index": 0
              }
            ]
          ]
        },
        "Split Out": {
          "main": [
            [
              {
                "node": "Loop Over Items1",
                "type": "main",
                "index": 0
              }
            ]
          ]
        },
        "gather data": {
          "main": [
            [
              {
                "node": "Basic LLM Chain",
                "type": "main",
                "index": 0
              }
            ]
          ]
        },
        "Google Search": {
          "main": [
            [
              {
                "node": "Loop Over Items",
                "type": "main",
                "index": 0
              }
            ]
          ]
        },
        "Basic LLM Chain": {
          "main": [
            [
              {
                "node": "Split Out",
                "type": "main",
                "index": 0
              }
            ]
          ]
        },
        "Loop Over Items": {
          "main": [
            [
              {
                "node": "gather data",
                "type": "main",
                "index": 0
              }
            ],
            [
              {
                "node": "Google Search",
                "type": "main",
                "index": 0
              }
            ]
          ]
        },
        "Loop Over Items1": {
          "main": [
            [],
            [
              {
                "node": "Append row in sheet",
                "type": "main",
                "index": 0
              }
            ]
          ]
        },
        "OpenAI Chat Model": {
          "ai_languageModel": [
            [
              {
                "node": "Basic LLM Chain",
                "type": "ai_languageModel",
                "index": 0
              },
              {
                "node": "Structured Output Parser",
                "type": "ai_languageModel",
                "index": 0
              }
            ]
          ]
        },
        "Append row in sheet": {
          "main": [
            [
              {
                "node": "Loop Over Items1",
                "type": "main",
                "index": 0
              }
            ]
          ]
        },
        "Structured Output Parser": {
          "ai_outputParser": [
            [
              {
                "node": "Basic LLM Chain",
                "type": "ai_outputParser",
                "index": 0
              }
            ]
          ]
        }
      }
    }
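
The logic of the two Code nodes ("Set Pages" and "gather data") can be tried outside n8n. Below is a minimal sketch in plain JavaScript, assuming Node.js 14 or later. The function names `buildStartIndexes` and `gatherResults`, the `sample` data, and the cap at 100 results are illustrative assumptions and are not part of the exported workflow. The cap is there because the Custom Search JSON API serves at most 100 results per query, so requesting more than 10 pages would likely just produce failing requests.

```javascript
// Standalone sketch of the "Set Pages" and "gather data" Code nodes.
// Names and the result cap are assumptions for illustration only.

const PAGE_SIZE = 10;    // the API returns at most 10 results per request
const MAX_RESULTS = 100; // assumed cap: no results are served past the first 100

// Mirrors "Set Pages": 1-based start offsets, one per page (1, 11, 21, ...).
function buildStartIndexes(pages) {
  const startIndexes = [];
  for (let i = 0; i < pages; i++) {
    const start = PAGE_SIZE * i + 1;
    // Stop once a full page would run past the cap (last valid start is 91).
    if (start > MAX_RESULTS - PAGE_SIZE + 1) break;
    startIndexes.push(start);
  }
  return startIndexes;
}

// Mirrors "gather data": flatten each response's `items` into
// { title, snippet } pairs, skipping responses that have no `items`.
function gatherResults(responses) {
  const results = [];
  for (const response of responses) {
    for (const { title, snippet } of response.items ?? []) {
      results.push({ title, snippet });
    }
  }
  return results;
}

// Hypothetical API responses: one page with a hit, one empty page.
const sample = [
  { items: [{ title: "A Dental", snippet: "Contact: a@gmail.com" }] },
  { searchInformation: { totalResults: "0" } },
];

console.log(buildStartIndexes(2)); // [ 1, 11 ]
console.log(gatherResults(sample));
```

With `"pages": 2` from the Input node, the search runs twice, with `start=1` and `start=11`, and the snippets from both pages reach the LLM chain as a single `results` array.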