Pluto Automation Script Developer Guide
This guide is for developers building custom automation scripts with Playwright in Pluto’s secure environment. We’ll walk through two working examples (a pseudo-bank demo and a Reddit karma checker) and cover best practices for robust, failure-tolerant automations. Topics include handling 2FA/OTP, adapting to different login flows, scraping data reliably, debugging techniques, and tips for dealing with issues like layout changes or CAPTCHAs.
Introduction to Pluto and Embedded Automations
The most basic automation structure:
import { createSession } from "@plutoxyz/automation";
import { chromium } from "playwright-core";
// Create a new Pluto session (secure browser context)
const session = await createSession();
// ... your automation steps ...
await session.close(); // make sure to close the session at the end
Every script begins by creating a session, which handles secure browser launch and user prompts, and ends by closing the session. Within the script, you’ll navigate pages, fill forms, and use session.prove(…) to output any data you want to attest.
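One way to make sure the session is closed even when a step throws is a try/finally wrapper. The sketch below is illustrative only: withSession is a hypothetical helper (it is not part of @plutoxyz/automation), and the session factory is passed in as a parameter so the pattern can be tried with a stand-in object.

```javascript
// Hypothetical helper: runs `steps` with a session and always closes it.
// `create` is any async factory returning an object with an async close()
// method, for example createSession from "@plutoxyz/automation".
async function withSession(create, steps) {
  const session = await create();
  try {
    return await steps(session);
  } finally {
    // Runs on success and on failure, so the session is always released
    await session.close();
  }
}
```

In a real script this might be called as withSession(createSession, async (session) => { /* automation steps */ }).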
When run inside Pluto’s environment (e.g. via Pluto Frame in your app), the script will execute in a secure browser. For local testing and development, you can also run it on your machine (more on this in the debugging section).
With the basics in place, let’s dive into a full example.
Example 1: Pseudo-Bank Login and Balance Proof
Our first example is a simple script that logs into a fake bank website and retrieves the account balance. This will illustrate the fundamental structure of a Pluto automation script: prompting for user input, performing actions with Playwright, waiting for content, and proving the result.
Script Overview
The script will prompt the user for a username and password, navigate to the bank’s login page, sign in, wait for the accounts overview to load, scrape the balance, and then output (prove) the balance.
Pseudo-Bank Script Code
import { createSession } from "@plutoxyz/automation";
import { chromium } from "playwright-core";
const session = await createSession();
/**
* Prompt the user for credentials
*/
const [username, password] = await session.prompt({
title: "Login to Pseudo Bank",
description: "Enter your username and password to login to Pseudo Bank",
prompts: [
{ label: "Username", type: "text", attributes: {} },
{ label: "Password", type: "password", attributes: {} },
],
});
/**
* Connect to the session's browser and use its existing page
*/
const browser = await chromium.connectOverCDP(await session.cdp());
const context = browser.contexts()[0];
const page = context.pages()[0];
/**
* Navigate to the login page and enter credentials
*/
await page.goto("https://pseudo-bank.pluto.dev");
await page.getByRole("textbox", { name: "Username" }).fill(username);
await page.getByRole("textbox", { name: "Password" }).fill(password);
await page.getByRole("button", { name: "Login" }).click();
await page.waitForSelector("text=Your Accounts", { timeout: 5000 });
/**
* Scrape the account balance from the page
*/
const balanceLocator = page.locator("#balance-2");
await balanceLocator.waitFor({ state: "visible", timeout: 5000 });
const balanceText = (await balanceLocator.textContent()) || "";
const balance = parseFloat(balanceText.replace(/[$,]/g, ""));
/**
* Prove the bank balance (output the result)
*/
await session.prove("bank_balance", balance);
// Cleanup
await page.close();
await browser.close();
await session.close();
Explanation
- Prompting the User: The script uses session.prompt(...) to ask the user for their Username and Password. This generates a secure prompt (via Pluto Frame) on the user’s side, and returns the entered values. By not hardcoding credentials and using Pluto’s prompt, we ensure sensitive info is never exposed in logs or code.
- Launching the Browser: We connect to a Chromium browser via chromium.connectOverCDP(await session.cdp()). Under the hood, createSession() has started a browser in a secure enclave. Using connectOverCDP attaches our script to that browser’s DevTools protocol. We retrieve the browser context and page.
- Navigating and Logging In: page.goto('https://pseudo-bank.pluto.dev') opens the login page. We then fill the login form using Playwright locators:
  - page.getByRole('textbox', { name: 'Username' }) finds the username input field by its accessible role and label text. This is more robust than querying by CSS selectors or placeholder, because it relies on screen-reader labels which usually remain stable. We call .fill(username) to type the provided username.
  - Similarly, we fill the password field and click the Login button (found by its role and name).
  - After clicking login, we wait for a known element or text that indicates a successful login. Here page.waitForSelector('text=Your Accounts') waits until the text “Your Accounts” appears on the page (with a timeout of 5 seconds). This ensures the next step runs only after the accounts overview is loaded. Always use waits after navigation or actions that trigger page loads; it’s critical for script reliability.
- Scraping Data: Once logged in, the script needs to extract the account balance. We use a CSS selector #balance-2 (an element with id “balance-2”) which we know holds the balance. We do:
  - const balanceLocator = page.locator('#balance-2'); to create a locator for that element.
  - await balanceLocator.waitFor({ state: 'visible' }) to ensure the element is present and visible.
  - const balanceText = await balanceLocator.textContent() to grab the text content (e.g. “$8,910.30”). We then parse it by stripping out currency symbols and commas and converting to a float.
  - This approach is simple for a controlled example. In real scenarios, you might need to trim whitespace or handle different formats. The key is to reliably locate the element containing the data.
- Proving the Result: Finally, we call session.prove('bank_balance', balance). This reports the balance value under the label ‘bank_balance’ to Pluto’s attestation system.
- Cleanup: We close the page, browser, and session. Always close the session at the end of your script to release resources.
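The parsing step is a natural place to guard against empty or unexpected text, since parseFloat on an empty string yields NaN and the script would then prove NaN. A minimal sketch, assuming balances look like “$8,910.30”; parseBalance is a hypothetical helper name, not something the Pluto library provides.

```javascript
// Parse text such as " $8,910.30 " into a number.
// Throws on empty or malformed input rather than returning NaN.
function parseBalance(text) {
  const cleaned = (text ?? "").trim().replace(/[$,]/g, "");
  if (!/^-?\d+(\.\d+)?$/.test(cleaned)) {
    throw new Error(`Unexpected balance format: "${text}"`);
  }
  return Number.parseFloat(cleaned);
}
```

With this in place, the script could call parseBalance(await balanceLocator.textContent()) and fail loudly if the page layout changes.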
This pseudo-bank example demonstrates the basic pattern for automation scripts. Next, we’ll examine a more complex real-world example with additional challenges.
Types of Prompts available
TODO: Add examples of each prompt type
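Until this section is filled in, the prompt types that actually appear in this guide are text, password, and checkbox. The sketch below only collects those three, with attribute names copied from the examples elsewhere in this guide; other types or attributes may exist and are not shown. buildPrompt is a hypothetical helper for illustration.

```javascript
// Prompt definitions in the shape used by this guide's examples.
const textPrompt = {
  label: "Email or username",
  type: "text",
  attributes: { placeholder: "Email or username", min_length: 3 },
};
const passwordPrompt = {
  label: "Password",
  type: "password",
  attributes: { placeholder: "Password" },
};
const codePrompt = { label: "Code", type: "text", attributes: { length: 6 } };
const choicePrompt = {
  label: "Login Type",
  type: "checkbox",
  attributes: { multiple: false, options: ["Email or Username", "Google"] },
};

// Hypothetical helper: assembles the object passed to session.prompt(...)
function buildPrompt(title, description, prompts) {
  return { title, description, prompts };
}
```

In the guide’s examples, text and password prompts yield a string per prompt, while a checkbox prompt yields an array of selected options (hence authType[0] in the Reddit script).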
Example 2: Reddit Karma Collector
Our second example is an automation that logs into Reddit and proves the user’s total karma.
- Reddit allows login either with username/password or via Google OAuth – our script supports both.
- Reddit accounts may have 2FA enabled – our script detects and handles OTP entry.
- Logging in via Google may trigger a passkey prompt – our script deals with this by simulating a security key.
- After login, Reddit’s UI (new design) requires navigating a menu to reach the profile page and then scraping the karma values.
- We must account for possibly different DOM structures (e.g. if user is on the new Reddit design vs old design, or mobile vs desktop). Our example assumes the new Reddit desktop layout. We’ll discuss how layout differences can affect selectors and how to handle that.
Let’s break down the Reddit script into parts for clarity.
a. Prompting for Login Method
Reddit offers multiple login methods. We start by asking the user how they want to log in – either with “Email or Username” (i.e. Reddit credentials) or “Google” (using Google OAuth). This determines which flow to follow.
const AUTH_TYPES = {
USERNAME: "Email or Username",
GOOGLE: "Google",
};
// Prompt the user to choose login method
const [authType] = await session.prompt({
title: "Log in",
description: "How do you want to log in?",
prompts: [
{
label: "Login Type",
type: "checkbox",
attributes: {
multiple: false,
options: Object.values(AUTH_TYPES), // ['Email or Username', 'Google']
},
},
],
});
console.log(`User chose ${authType[0]}`);
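The user then selects one of the options in the Pluto Frame UI.
With the choice in hand, we connect to the session’s browser as before. Note that createSession() has to run before the prompt above, because session.prompt(...) is a method on the session; the full script creates the session first:
const session = await createSession(); // must come before the session.prompt(...) call above
const browser = await chromium.connectOverCDP(await session.cdp());
const context = browser.contexts()[0];
const page = context.pages()[0];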
Now we can proceed with the login. We will handle two flows:
- Standard Reddit Login (username/password, possibly with Reddit 2FA).
- Google OAuth Login (which involves a popup and Google’s flow, possibly with Google 2FA).
But first, one common piece: Reddit may occasionally present a 2FA check when it detects a suspicious login attempt, even before credentials are entered (this is rare; 2FA usually comes after the password). The script includes a helper, handleSecurityDetection(), for Reddit’s own 2FA. We’ll call this function at the appropriate points.
b. Handling Reddit’s 2FA (One-Time Password)
Reddit’s two-factor authentication page has a textbox for a verification code and a “Check code” button. To handle this if it appears, our script uses a detection function:
// Detect if Reddit is asking for a 2FA code and handle it
const handleSecurityDetection = async () => {
// Check whether the 2FA verification code input is present (presence only, not visibility)
const security = !!(
await page
.getByRole("textbox", { name: "Verification code" })
.elementHandles()
).length;
if (security) {
console.log(`Reddit 2FA detected, prompting for security code`);
const otp = await session.prompt({
title: "Enter one time code",
description: "Enter the one time code sent to your phone",
prompts: [{ label: "Code", type: "text", attributes: { length: 6 } }],
});
console.log(`Filling security code`);
await page.getByRole("textbox", { name: "Verification code" }).fill(otp[0]);
await page.getByRole("button", { name: "Check code" }).click();
await page.waitForLoadState("domcontentloaded");
console.log(`Security code submitted`);
} else {
console.log(`No security code detected, continuing`);
}
};
What this does: It looks for a textbox with the accessible name “Verification code”. If one is found, it assumes the Reddit 2FA page is present. It then prompts the user for the 6-digit code (e.g., from their authenticator app or SMS), fills it in, clicks “Check code”, and waits for the page to load after submission. If no such textbox is found, it logs that no 2FA was needed. We will invoke handleSecurityDetection() after attempting login, to cover the case where Reddit requires a code.
Why check .elementHandles().length instead of isVisible()? In our testing, we’ve found isVisible() to be somewhat unreliable. If we only need to check for the presence of an element, .elementHandles().length will suffice. isVisible() is only required if we need to ensure the element is actually visible to the user. In addition, we use elementHandles().length over elementHandle() because the latter waits for the element and throws an error if it is not found, which we don’t want.
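The same presence check can also be written with Playwright’s locator.count(), which resolves to the number of matching elements without waiting for one to appear. A minimal sketch; isPresent is a hypothetical helper name, and the two stub objects only stand in for real Playwright locators so the snippet runs on its own.

```javascript
// Hypothetical helper: true when the locator matches at least one element.
// Works with any object exposing an async count(), such as a Playwright Locator.
async function isPresent(locator) {
  return (await locator.count()) > 0;
}

// Stand-in locators for demonstration only (not Playwright objects)
const missingLocator = { count: async () => 0 };
const foundLocator = { count: async () => 1 };
```

In the Reddit script this might read: const security = await isPresent(page.getByRole("textbox", { name: "Verification code" }));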
c. Standard Login (Username & Password)
Now, the function for the normal Reddit login using username/password:
const authUsernamePass = async (usernameError, passwordError) => {
console.log(`Prompting user for username/password`);
const [username, password] = await session.prompt({
title: "Login to Reddit",
description: "Enter your Reddit credentials to prove your data",
prompts: [
{
label: "Email or username",
type: "text",
attributes: { placeholder: "Email or username", min_length: 3 },
error: usernameError,
},
{
label: "Password",
type: "password",
attributes: { placeholder: "Password" },
error: passwordError,
},
],
});
console.log(`Filling username`);
await page.getByRole("textbox", { name: "Email or username" }).fill(username);
console.log(`Filling password`);
await page.getByRole("textbox", { name: "Password" }).fill(password);
await page.getByRole("button", { name: "Log In" }).click();
await page.waitForTimeout(2000); // small wait for page to respond
// Check if a 2FA code input became visible (Reddit 2FA)
const has2FA = await page
.getByRole("textbox", { name: "Verification code" })
.isVisible();
if (has2FA) {
console.log(`Detected 2-Step Verification (OTP) for Reddit login`);
const [code] = await session.prompt({
title: "Enter one time code",
description: "Enter the one time code from your authenticator app",
prompts: [{ label: "Code", type: "text", attributes: { length: 6 } }],
});
await page.getByRole("textbox", { name: "Verification code" }).fill(code);
await page.getByRole("button", { name: "Check code" }).click();
console.log(`2FA code submitted to Reddit`);
} else {
console.log(`No 2FA required for Reddit login`);
}
};
Key points in this authUsernamePass flow:
- We prompt the user for their Reddit credentials. Notice we include optional error fields for the prompts. If the login fails (e.g., wrong password), we could call authUsernamePass again and supply an error message to display to the user.
- We fill the login form using getByRole locators, similar to the bank example.
- After clicking “Log In”, we wait 2 seconds with waitForTimeout(2000). This is a simple way to give the page time to either navigate or reveal an error. A more robust approach might be await page.waitForURL() to detect if we moved off the login page, or waiting for some element that only appears on success or failure. The short wait helps ensure the next check doesn’t execute too early.
- We then check for Reddit’s OTP prompt using .isVisible() on the “Verification code” field. Here we use isVisible() instead of checking existence because (unlike earlier) Reddit’s login page always includes the 2FA field in the DOM (hidden with CSS if 2FA is not needed). So we must check visibility to know if a code is actually being asked for. If 2FA is required, we prompt the user for their authenticator code, fill it, and click “Check code”.
- If no 2FA is required, we simply continue. At this point, the user is logged in (or, if the credentials were wrong, technically still on the login page; our script does not yet handle that scenario with a retry, which could be an enhancement).
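The retry enhancement mentioned in the last point could be structured as a small loop that feeds the previous attempt’s errors back into the next prompt. This is a sketch under assumptions: loginWithRetry is a hypothetical helper, and it presumes authUsernamePass is changed to return a result object of the form { ok, usernameError, passwordError } (the version in this guide returns nothing).

```javascript
// Hypothetical retry wrapper around a login attempt.
// `attempt(usernameError, passwordError)` performs one login and resolves to
// { ok: boolean, usernameError?: string, passwordError?: string }.
async function loginWithRetry(attempt, maxAttempts = 3) {
  let usernameError;
  let passwordError;
  for (let i = 1; i <= maxAttempts; i += 1) {
    const result = await attempt(usernameError, passwordError);
    if (result.ok) return i; // number of attempts it took
    // Carry the errors into the next prompt so the user sees what went wrong
    usernameError = result.usernameError;
    passwordError = result.passwordError;
  }
  throw new Error(`Login failed after ${maxAttempts} attempts`);
}
```

Capping the number of attempts avoids prompting the user indefinitely and reduces the chance of triggering account lockouts.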
d. Google OAuth Login (with Passkey and 2FA handling)
If the user chose to log in with Google, the flow is more complex because it involves a popup window and potentially various Google security steps. The script’s authGoogle function manages this:
const authGoogle = async () => {
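// Start waiting for the popup before clicking, so the event cannot be missed
const popupPromise = page.waitForEvent("popup");
await page.getByRole("button", { name: "Continue with Google" }).click();
console.log(`Opening Google login popup`);
const googlePage = await popupPromise;
console.log(`Google login popup opened`);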
// Bypass WebAuthn passkey prompts by using a virtual authenticator
const client = await context.newCDPSession(googlePage);
await client.send("WebAuthn.enable");
await client.send("WebAuthn.addVirtualAuthenticator", {
options: { protocol: "ctap2", transport: "internal" },
});
console.log(`Virtual authenticator added to disable passkey prompts`);
// Prompt user for their Google account credentials
const [email, password] = await session.prompt({
title: "Google Email or phone",
description: "Enter your Google credentials to login",
prompts: [
{
label: "Email or phone",
type: "text",
attributes: { placeholder: "Email or phone", min_length: 3 },
},
{
label: "Password",
type: "password",
attributes: { placeholder: "Password" },
},
],
});
console.log(`Filling Google email`);
await googlePage.waitForLoadState("domcontentloaded");
await googlePage.getByRole("textbox", { name: "Email or phone" }).fill(email);
await googlePage.getByText("Next").click();
await googlePage.waitForNavigation();
await googlePage.waitForTimeout(1000); // wait for password field to appear
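// If "Try another way" is present, the account might have passkey enabled.
// A short timeout (3000 ms is an arbitrary choice) keeps click() from waiting
// for the full default timeout when the button is absent.
await googlePage
.getByRole("button", { name: "Try another way" })
.click({ timeout: 3000 })
.catch(() => {});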
console.log(`Filling Google password`);
await googlePage
.getByRole("textbox", { name: "Enter your password" })
.fill(password);
await googlePage.getByText("Next").click();
await googlePage.waitForNavigation({ waitUntil: "networkidle" });
await googlePage.waitForTimeout(3000);
// Check if Google asks for 2-Step Verification
const has2Step = !!(
await googlePage.getByText("2-Step Verification").elementHandles()
).length;
if (has2Step) {
console.log(`Google account has 2-Step Verification enabled`);
// Google might present multiple 2FA options (app prompt, authenticator code, SMS)
// We will detect available options and prompt user to choose one.
// Optional screen: sometimes Google shows a "use your device / try another way" interim step
const optionalPrompt = await googlePage.getByText(
"fingerprint, face, or screen lock"
);
if ((await optionalPrompt.elementHandles()).length) {
console.log(
`Google presented an extra verification option screen, skipping it`
);
await googlePage.getByRole("button", { name: "Try another way" }).click();
await googlePage.waitForTimeout(1000);
}
// Define possible 2FA options
const options = [
{
label: "Tap Yes on your phone or tablet",
locator: googlePage.getByRole("link", {
name: " on your phone or tablet",
}),
followup: async () => {
// This option is just a prompt on user's device; we'll ask user to confirm they did it
},
prompt: async () => ({
title: "Confirm Access",
description:
'Please tap "Yes" on your phone to allow sign-in, then confirm here to continue.',
prompts: [
{
label: "Confirm to continue",
type: "checkbox",
attributes: { options: ["Confirmed"], required: true },
},
],
}),
},
{
label: "Get a verification code from the Google Authenticator app",
locator: googlePage.getByRole("link", {
name: "Get a verification code from",
}),
followup: async (_page, _code) => {
// This will execute after user provides code
await _page.getByRole("textbox", { name: "Enter code" }).fill(_code);
await _page.getByText("Next").click();
},
prompt: () => ({
title: "Enter one time code",
description: "Enter the code from your Google Authenticator app",
prompts: [{ label: "Code", type: "text", attributes: { length: 6 } }],
}),
},
{
label: "Get a verification code sent to your phone",
locator: googlePage.getByRole("link", {
name: "Get a verification code at",
}),
followup: async (_page, _code) => {
await _page
.getByRole("textbox", { name: "Enter the code" })
.fill(_code);
await _page.getByText("Next").click();
},
prompt: () => ({
title: "Enter one time code",
description: "Enter the one time code sent to your phone via SMS",
prompts: [{ label: "Code", type: "text", attributes: { length: 6 } }],
}),
},
];
// Filter which options are actually enabled on this account
const enabledOptions = [];
for (const opt of options) {
if (
(await opt.locator.elementHandles()).length &&
!(await opt.locator.isDisabled())
) {
enabledOptions.push(opt);
}
}
if (enabledOptions.length === 0) {
console.warn(`No 2FA options available or enabled, this is unexpected.`);
} else if (enabledOptions.length === 1) {
// Only one option available, proceed with it
console.log(`Single 2FA option detected: ${enabledOptions[0].label}`);
await enabledOptions[0].locator.click();
console.log(`Prompting user for code (if needed)`);
const [code] = await session.prompt(await enabledOptions[0].prompt());
if (enabledOptions[0].followup) {
await enabledOptions[0].followup(googlePage, code);
console.log(`2FA code entered for Google`);
}
} else {
// Multiple options, ask user to choose
console.log(
`Multiple 2FA options detected for Google: ${enabledOptions
.map((o) => o.label)
.join(", ")}`
);
const [choice] = await session.prompt({
title: "Two Factor Authentication",
description: "Choose a 2FA method to continue Google login",
prompts: [
{
label: "2FA Method",
type: "checkbox",
attributes: {
multiple: false,
options: enabledOptions.map((o) => o.label),
},
},
],
});
const selected = enabledOptions.find((o) => o.label === choice[0]);
await selected.locator.click();
console.log(`User chose 2FA method: ${selected.label}`);
const [code] = await session.prompt(await selected.prompt());
if (selected.followup) {
await selected.followup(googlePage, code);
console.log(`2FA step completed for Google`);
}
}
} else {
console.log(`Google account did not require 2-Step Verification`);
}
};
This is the most involved part of the script, so let’s unpack it:
- We click the “Continue with Google” button on Reddit’s login page. Playwright captures the popup that opens (page.waitForEvent('popup')) as googlePage. Now we have a separate googlePage page object to automate the Google login.
- Bypassing Passkey Prompts: New Google sign-ins may prompt the user to use a passkey or a device (e.g., on Android you might get “Use your screen lock or fingerprint”). Since our automation cannot complete those, we preemptively add a virtual authenticator via the Chrome DevTools Protocol:
  - We obtain a CDP session for the googlePage (with context.newCDPSession(googlePage)).
  - We enable the WebAuthn domain and add a virtual authenticator (WebAuthn.addVirtualAuthenticator) with the CTAP2 protocol. This registers a software security key with the page, so if Google initiates a passkey login, the request is answered by the virtual authenticator right away instead of waiting on the user’s biometric action (which would deadlock the script).
  - We log that we added the virtual authenticator to help with debugging.
  This trick is crucial if the user has passkeys enabled on their Google account. Without it, the script would hang at a passkey prompt.
- Google account credentials: We prompt the user for their Google email and password. Then:
  - Wait for the Google login page to load, fill the email/phone field, click Next.
  - After navigation, wait a moment for the password field and possibly a “Try another way” option. We call .click() on “Try another way” and ignore errors if it’s not present. This step is there to bypass any lingering prompt (for instance, if Google initially wants to use a stored credential or device, clicking “Try another way” forces it to ask for the password).
  - Fill the password and click Next, then wait for navigation to complete (using the networkidle state to ensure all network calls settle). At this point, if the credentials are correct, either the Google login is done (if no 2FA) or Google will proceed to a 2-Step Verification step.
- Handling Google’s 2-Step Verification: We check if the text “2-Step Verification” is present. If yes, the Google account has 2FA. Google might offer multiple methods (like prompt, authenticator code, SMS code, backup codes, etc.). Our script:
  - Checks for an intermediate prompt like “use your fingerprint, face, or screen lock”. If found, it clicks “Try another way” to skip that, because we can’t handle device biometrics in automation.
  - Defines possible 2FA options we know how to handle:
    - “Tap Yes on your phone or tablet”: this is the default push notification Google prompt. Our approach is to have the user tap yes on their device and then confirm in our UI that they did it. We can’t automate the phone tap, so we rely on the user. The prompt for this option asks the user to confirm once they’ve approved the login on their device.
    - Authenticator App code: Google Authenticator or similar. The script will prompt the user for the 6-digit code, then fill it in and click Next.
    - SMS code: similar to authenticator, but the code is sent via SMS.
    (You could extend this with backup codes or other methods if needed.)
  - Checks which of these options are actually present and enabled on the page. For example, if the user has an authenticator app configured, that option’s link will appear. If not, it might be absent or disabled. We gather the available options into enabledOptions.
  - If no known options are available (which would be unusual), we log a warning. If exactly one option is available, we proceed with it automatically. If multiple are available, we prompt the user to pick one (similar to how we prompted for login method).
  - After the user either confirms or provides a code depending on the method, we run the appropriate follow-up: e.g., fill the code into the page and submit. This completes the Google login flow.
This Google login handling illustrates how to script around unpredictable security flows by:
- detecting what is presented,
- offering the user choices or prompts as needed,
- and leveraging browser capabilities (like CDP for WebAuthn) to handle things programmatically where possible.
Tip: Virtual Authenticators for Passkeys: The use of a virtual authenticator to bypass passkey or WebAuthn prompts can be reused for other sites that use WebAuthn (passkeys or security keys). It essentially automates a hardware key. In Playwright, you can enable it via CDP as shown. This is an advanced technique, but extremely useful when automating logins for accounts that have passkeys enabled. Without it, the script could be blocked by a modal waiting for user interaction on the physical device.
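The branching on enabled 2FA options (none, exactly one, or several) can be isolated into a small function that is easy to test without a browser. This is a sketch, not the script’s actual structure: chooseOption is a hypothetical helper, and ask stands in for a call to session.prompt that resolves to the label the user picked.

```javascript
// Hypothetical helper mirroring the script's branching on enabled 2FA options.
// `options` is an array of objects with a `label`; `ask(labels)` is an async
// function resolving to the label the user selected.
async function chooseOption(options, ask) {
  if (options.length === 0) return null; // nothing we know how to handle
  if (options.length === 1) return options[0]; // no need to ask the user
  const label = await ask(options.map((o) => o.label));
  return options.find((o) => o.label === label) ?? null;
}
```

Returning null for the empty and unmatched cases lets the caller decide whether to warn, retry, or abort.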
e. Running the Chosen Login Flow
After defining authUsernamePass and authGoogle, and the handleSecurityDetection helper, our script triggers the correct flow based on the user’s choice:
// Navigate to Reddit login page
const LOGIN_URL = "https://www.reddit.com/login/";
await page.goto(LOGIN_URL, { waitUntil: "domcontentloaded" });
console.log(`${LOGIN_URL} loaded`);
// Handle any immediate security page (unlikely on initial load)
await handleSecurityDetection();
// Execute the chosen auth flow
switch (authType[0]) {
case AUTH_TYPES.GOOGLE:
console.log(`Proceeding with Google OAuth flow`);
await authGoogle();
// Wait for Reddit to possibly redirect after Google login
await page.waitForLoadState("domcontentloaded");
// If Reddit now shows its 2FA (uncommon after Google login, but just in case)
await handleSecurityDetection();
break;
case AUTH_TYPES.USERNAME:
default:
console.log(`Proceeding with username/password flow`);
await authUsernamePass();
// (Reddit 2FA handled inside authUsernamePass)
break;
}
console.log(`User logged in to Reddit`);
We load the Reddit login page and immediately call handleSecurityDetection() in case Reddit threw a security check upfront (generally it won’t, but it’s there for completeness). Then:
- If the user chose Google, we call authGoogle(). After it returns, the Google popup has closed and the main page should be logged in to Reddit (Google handled authentication and Reddit redirected back). We wait for the content to load and call handleSecurityDetection() once more in case Reddit still asks for its own 2FA after the OAuth (usually not).
- If the user chose username, we call authUsernamePass(), which handles Reddit login (and runs its own 2FA prompt if needed).
- After either flow, we consider the user logged in.
At this point, if everything succeeded, we have an authenticated Reddit session on page. If credentials were wrong or a step failed, the script might still be on the login page or in an error state. In a production script, you’d want to detect and handle failures (e.g., by checking the current URL or page content to ensure login success and maybe re-prompt or abort with an error message). For brevity, our example assumes success for the demonstration.
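A URL-based check is one simple way to detect a failed login. The sketch below assumes that a failed attempt leaves the browser on a path under /login, as LOGIN_URL suggests; that is an assumption about Reddit’s behavior, and isStillOnLogin is a hypothetical helper name.

```javascript
// Hypothetical check: treats any URL whose path is /login or starts with
// /login/ as "not logged in yet".
function isStillOnLogin(currentUrl) {
  const { pathname } = new URL(currentUrl);
  return pathname === "/login" || pathname.startsWith("/login/");
}
```

In the script this could be used as if (isStillOnLogin(page.url())) { /* re-prompt or abort */ } right after the chosen auth flow returns.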
f. Navigating Reddit and Scraping Karma
Now that we’re logged in, the next steps are to go to the user’s profile page and scrape the karma values.
// Navigate to the user's profile page
console.log(`Navigating to User Profile`);
await page.getByRole("button", { name: /expand user menu$/i }).click();
await page.getByText("View Profile").click();
// Wait for profile page to load
await page.waitForLoadState("domcontentloaded");
console.log(`Profile page loaded`);
// Scrape the Reddit username and karma
console.log(`Scraping user data`);
const redditUsername = await page
.getByLabel("Profile information")
.getByRole("heading")
.first()
.innerText();
const [postKarma, commentKarma] = await page
.getByTestId("karma-number")
.allInnerTexts();
// Calculate total karma (strip every thousands separator, not just the first)
const totalKarma = (
Number(postKarma.replace(/,/g, "")) + Number(commentKarma.replace(/,/g, ""))
).toLocaleString("en-US");
Explanation:
- Navigating to profile: We click on the user menu. On new Reddit’s interface, the user menu is opened by clicking the user avatar button. The code uses a regex in the locator name (/expand user menu$/i) to find the button whose name ends with “Expand user menu” (case-insensitive). This is an example of using a regex in getByRole to handle dynamic parts (the username might be part of the button label in some cases, hence just matching the end).
  Note: If the account is brand new or has no avatar, the label might differ, but generally this works for Reddit’s UI. Alternatively, one could go directly to reddit.com/user/<username> if the username is known, but we obtain the username only after login. Clicking through the UI ensures we follow the intended flow and end up in the right place.
- Scraping username: Reddit’s profile page (new design) has the profile information section containing the display name. We use a chained locator: page.getByLabel('Profile information').getByRole('heading').first(). This finds the element labeled “Profile information” (which is an accessible label on the container of profile details), then within that container finds the first heading. That happens to be the username (the Reddit handle of the user) displayed on the profile. We then get its inner text.
  This approach is a bit more robust than something like page.getByText('u/') because it anchors on a known labeled section. It demonstrates how to scope locators to a certain section of the page for reliability.
- Scraping karma: Reddit shows two karma numbers (Post Karma and Comment Karma) on the profile. In the DOM, they are marked with data-testid="karma-number". We use getByTestId('karma-number'), which matches both elements. By calling .allInnerTexts(), we get an array of the text content of each matching element. The order should correspond to post and comment karma. We assign them to [postKarma, commentKarma].
- Computing total karma: We parse the two strings to integers (removing commas) and sum them, then format the sum as a localized string (to add comma separators for thousands).
Now we have the data we want: redditUsername, postKarma, commentKarma, and totalKarma.
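Karma parsing is worth isolating so that values with several thousands separators (for example “1,234,567”) and unexpected formats are handled explicitly. A minimal sketch, assuming karma is displayed as a plain integer with optional commas; parseKarma and formatTotalKarma are hypothetical helper names.

```javascript
// Parse a karma string such as "1,234,567" into a number.
// Throws instead of returning NaN so bad data is never proved silently.
function parseKarma(text) {
  const cleaned = text.trim().replace(/,/g, "");
  if (!/^-?\d+$/.test(cleaned)) {
    throw new Error(`Unexpected karma format: "${text}"`);
  }
  return Number(cleaned);
}

// Sum post and comment karma and format the total with en-US separators
function formatTotalKarma(postKarma, commentKarma) {
  return (parseKarma(postKarma) + parseKarma(commentKarma)).toLocaleString("en-US");
}
```

Failing on an unrecognized format makes a layout or formatting change on the site visible immediately instead of producing a wrong proof.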
g. Proving the Data and Cleanup
Finally, we output the data via session.prove and close everything:
await session.prove("reddit-data", [
{ redditUsername: redditUsername },
{ totalKarma: totalKarma },
{ postKarma: postKarma },
{ commentKarma: commentKarma },
]);
console.log(`Proof generated, closing session`);
await page.close();
await browser.close();
await session.close();
We call session.prove with the key ‘reddit-data’ and an array of key-value pairs as the value. This is how you can structure multiple pieces of data in one proof. In the Pluto attestation, this might appear as a JSON array or simply multiple outputs under the ‘reddit-data’ proof. Essentially, we are proving a collection of values: redditUsername (the account’s username), totalKarma (sum of karma), postKarma, and commentKarma.
After that, we gracefully close the page, browser, and session.
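If the scraped values are already collected in a plain object, the array-of-single-key-objects shape used in the prove call can be derived mechanically. A small sketch; toProofEntries is a hypothetical helper, and whether Pluto requires this exact shape is not stated in this guide.

```javascript
// Hypothetical helper: turns { a: 1, b: 2 } into [{ a: 1 }, { b: 2 }],
// matching the shape passed to session.prove("reddit-data", ...) in this guide.
function toProofEntries(data) {
  return Object.entries(data).map(([key, value]) => ({ [key]: value }));
}
```

For example, toProofEntries({ redditUsername, totalKarma, postKarma, commentKarma }) produces the same array as the literal written out in the script.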
This completes the Reddit example script. It’s lengthy, but it showcases many techniques for robust automation:
- prompting and branching logic for different flows,
- waiting for events (popup, navigation, network idle),
- handling multi-step authentication,
- scraping using reliable selectors,
- and outputting structured results.
Full Reddit Script Code (for reference)
import { createSession } from "@plutoxyz/automation";
import { chromium } from "playwright-core";
const AUTH_TYPES = {
USERNAME: "Email or Username",
GOOGLE: "Google",
};
const LOGIN_URL = "https://www.reddit.com/login/";
/**
* Detect if we run into a Reddit 2FA page and handle it
*/
const handleSecurityDetection = async () => {
const security = !!(
await page
.getByRole("textbox", { name: "Verification code" })
.elementHandles()
).length;
if (security) {
console.log(`Reddit 2FA detected, prompting for security code`);
const otp = await session.prompt({
title: "Enter one time code",
description: "Enter the one time code sent to your phone",
prompts: [{ label: "Code", type: "text", attributes: { length: 6 } }],
});
console.log(`Filling security code`);
await page.getByRole("textbox", { name: "Verification code" }).fill(otp[0]);
await page.getByRole("button", { name: "Check code" }).click();
await page.waitForLoadState("domcontentloaded");
} else {
console.log(`No security code detected, continuing`);
}
};
const promptAuthType = async () =>
session.prompt({
title: "Log in",
description: "How do you want to log in",
prompts: [
{
label: "Login Type",
type: "checkbox",
attributes: { multiple: false, options: Object.values(AUTH_TYPES) },
},
],
});
const session = await createSession();
console.log(`Prompting for auth type`);
const [authType] = await promptAuthType();
console.log(`User chose ${authType[0]}`);
// Browser Setup
const browser = await chromium.connectOverCDP(await session.cdp());
const context = browser.contexts()[0];
const page = context.pages()[0];
console.log(`Browser opened`);
/**
* Standard Reddit Login (Username/Password) with optional 2FA
*/
const authUsernamePass = async (usernameError, passwordError) => {
console.log(`Prompting user for username/password`);
const [username, password] = await session.prompt({
title: "Login to Reddit",
description: "Enter your Reddit credentials to prove your data",
prompts: [
{
label: "Email or username",
type: "text",
attributes: { placeholder: "Email or username", min_length: 3 },
error: usernameError,
},
{
label: "Password",
type: "password",
attributes: { placeholder: "Password" },
error: passwordError,
},
],
});
console.log(`Filling username`);
await page.getByRole("textbox", { name: "Email or username" }).fill(username);
console.log(`Filling Password`);
await page.getByRole("textbox", { name: "Password" }).fill(password);
await page.getByRole("button", { name: "Log In" }).click();
await page.waitForTimeout(2000);
const has2FA = await page
.getByRole("textbox", { name: "Verification code" })
.isVisible();
if (has2FA) {
console.log(`Detected 2-Step Verification`);
const [code] = await session.prompt({
title: "Enter one time code",
description: "Enter the one time code from your authenticator app",
prompts: [{ label: "Code", type: "text", attributes: { length: 6 } }],
});
await page.getByRole("textbox", { name: "Verification code" }).fill(code);
await page.getByRole("button", { name: "Check code" }).click();
} else {
console.log(`No 2-Step Verification detected, continuing`);
}
};
/**
* Google OAuth Login flow with passkey/2FA handling
*/
const authGoogle = async () => {
await page.getByRole("button", { name: "Continue with Google" }).click();
console.log(`Opening popup`);
const googlePage = await page.waitForEvent("popup");
console.log(`Popup opened`);
// Disable Passkeys by adding a virtual authenticator
const client = await context.newCDPSession(googlePage);
await client.send("WebAuthn.enable");
await client.send("WebAuthn.addVirtualAuthenticator", {
options: { protocol: "ctap2", transport: "internal" },
});
console.log(`Passkeys disabled`);
console.log(`Prompting user for credentials`);
const [email, password] = await session.prompt({
title: "Google Email or phone",
description: "Enter your Google credentials to login",
prompts: [
{
label: "Email or phone",
type: "text",
attributes: { placeholder: "Email or phone", min_length: 3 },
},
{
label: "Password",
type: "password",
attributes: { placeholder: "Password" },
},
],
});
console.log(`Filling Email`);
await googlePage.waitForLoadState("domcontentloaded");
await googlePage.getByRole("textbox", { name: "Email or phone" }).fill(email);
await googlePage.getByText("Next").click();
await googlePage.waitForNavigation();
await googlePage.waitForTimeout(1000);
await googlePage
.getByRole("button", { name: "Try another way" })
.click()
.catch(() => {});
console.log(`Filling password`);
await googlePage
.getByRole("textbox", { name: "Enter your password" })
.fill(password);
console.log(`Password filled`);
await googlePage.getByText("Next").click();
await googlePage.waitForNavigation();
await googlePage.waitForLoadState("networkidle");
await googlePage.waitForTimeout(3000);
const has2FA = !!(
await googlePage.getByText("2-Step Verification").elementHandles()
).length;
if (has2FA) {
console.log(`Detected 2-Step Verification`);
const hasOptionalScreen = googlePage.getByText(
"fingerprint, face, or screen lock"
);
if (!!(await hasOptionalScreen.elementHandles()).length) {
console.log(`Detected extra 2fa screen, continuing`);
await googlePage.getByRole("button", { name: "Try another way" }).click();
await googlePage.waitForTimeout(1000);
}
const tabletOption = {
label: "Tap Yes on your phone or tablet",
locator: googlePage.getByRole("link", {
name: " on your phone or tablet",
}),
followup: async (_page, _code) => {},
prompt: async () => ({
title: "Confirm Access",
description: "Confirm you allowed access in your Google App",
prompts: [
{
label: "Confirm to continue",
type: "checkbox",
attributes: { required: true, options: ["Confirm"] },
},
],
}),
};
const authenticator = {
label: "Get a verification code from the Google Authenticator app",
locator: googlePage.getByRole("link", {
name: "Get a verification code from",
}),
followup: async (_page, _code) => {
await _page.getByRole("textbox", { name: "Enter code" }).fill(_code);
await _page.getByText("Next").click();
},
prompt: () => ({
title: "Enter one time code",
description: "Enter the one time code sent to your phone",
prompts: [{ label: "Code", type: "text", attributes: { length: 6 } }],
}),
};
const smsCode = {
label: "Get a verification code sent to your phone",
locator: googlePage.getByRole("link", {
name: "Get a verification code at",
}),
followup: async (_page, _code) => {
await _page
.getByRole("textbox", { name: "Enter the code" })
.fill(_code);
await _page.getByText("Next").click();
},
prompt: () => ({
title: "Enter one time code",
description: "Enter the one time code sent to your phone",
prompts: [{ label: "Code", type: "text", attributes: { length: 6 } }],
}),
};
const options = [tabletOption, authenticator, smsCode];
const enabledOptions = [];
for (const option of options) {
if ((await option.locator.elementHandles()).length) {
if (!(await option.locator.isDisabled())) {
enabledOptions.push(option);
}
}
}
if (enabledOptions.length > 1) {
console.log(`Prompting user with enabled 2-Factor options.`);
const [choice] = await session.prompt({
title: "Two Factor Authentication",
description: "Choose how you'd like to 2FA",
prompts: [
{
label: "Two Factor Method",
type: "checkbox",
attributes: {
multiple: false,
options: enabledOptions.map((o) => o.label),
},
},
],
});
console.log(`Selecting 2FA method`);
const selected = enabledOptions.find((i) => i.label === choice[0]);
await selected.locator.click();
console.log(`Prompting user for 2FA`);
const [code] = await session.prompt(await selected.prompt());
if (selected.followup) {
console.log(`Entering 2FA Code`);
await selected.followup(googlePage, code);
console.log(`2FA code entered`);
}
} else if (enabledOptions.length === 1) {
console.log(
`Found a single enabled option: ${await enabledOptions[0].locator.innerText()}`
);
await enabledOptions[0].locator.click();
console.log(`Prompting user for 2FA`);
const [code] = await session.prompt(await enabledOptions[0].prompt());
if (enabledOptions[0].followup) {
console.log(`Entering 2FA Code`);
await enabledOptions[0].followup(googlePage, code);
console.log(`2FA code entered`);
}
} else {
console.log(`No enabled options found.`);
// No 2FA options available (unexpected scenario)
}
} else {
console.log(`Google account did not require 2FA`);
}
};
// Go to Reddit login page and attempt login
console.log(`Navigating to ${LOGIN_URL}`);
await page.goto(LOGIN_URL, { waitUntil: "domcontentloaded" });
await handleSecurityDetection(); // check if Reddit immediately asks for OTP
switch (authType[0]) {
case AUTH_TYPES.GOOGLE: {
console.log(`Authing Google`);
await authGoogle();
await page.waitForLoadState("domcontentloaded");
await handleSecurityDetection();
break;
}
case AUTH_TYPES.USERNAME:
default: {
await authUsernamePass();
break;
}
}
console.log(`User Logged In`);
// Navigate to profile page
await page.waitForLoadState("domcontentloaded");
await page.getByRole("button", { name: /expand user menu$/i }).click();
await page.getByText("View Profile").click();
// Scrape Username and Karma
await page.waitForLoadState("domcontentloaded");
console.log(`User Profile Loaded`);
console.log(`Scraping Username`);
const handle = await page
.getByLabel("Profile information")
.getByRole("heading")
.first()
.innerText();
const [postKarma, commentKarma] = await page
.getByTestId("karma-number")
.allInnerTexts();
// Prove the data
await session.prove("reddit-data", [
{ redditUsername: handle },
{
totalKarma: (
// Strip every thousands separator (a string pattern would only remove the first)
Number.parseInt(postKarma.replace(/,/g, ""), 10) +
Number.parseInt(commentKarma.replace(/,/g, ""), 10)
).toLocaleString("en-US"),
},
{ postKarma },
{ commentKarma },
]);
await session.close();
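The full script above calls session.close() only at the very end, so an exception in any earlier step would skip it. One possible way to guarantee cleanup is a small wrapper built on try/finally. This is a sketch, not part of the Pluto API: runWithCleanup is a hypothetical name, and it assumes only that each resource exposes an async close() method, as page, browser, and session do in the examples.

```javascript
// Hypothetical wrapper: run `steps`, then close every resource even if a step throws.
// Assumes each resource exposes an async close() method.
async function runWithCleanup(resources, steps) {
  try {
    return await steps();
  } finally {
    // Close in the order given; one failing close() must not block the others.
    for (const resource of resources) {
      try {
        await resource.close();
      } catch (err) {
        console.log(`Cleanup step failed: ${err.message}`);
      }
    }
  }
}

// Usage in the Reddit script (names as in the script above):
// await runWithCleanup([page, browser, session], async () => {
//   await page.goto(LOGIN_URL, { waitUntil: "domcontentloaded" });
//   // ... login, scrape, session.prove(...) ...
// });
```

The original error still propagates to the caller after cleanup, so a failed run remains visible in the logs.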
Debugging and Development Tips
- Be sure to try out our Script Editor. It has built-in logging, auto-completion, debugging, and examples.
- When writing automation scripts for Pluto, you’ll inevitably encounter cases where things don’t work on the first try: a selector might be wrong, a page may present an unexpected prompt, or a script might get stuck waiting for something. Below are tips and best practices to debug and harden your scripts.
- When checking whether an element exists, if you only care that it is in the DOM and not whether the user can see it, use .elementHandles().length instead of isVisible(), as it is more reliable. (locator.count() returns the same number without creating element handles.)
- Prefer finding locators by role, label, or test id, as these are more stable than by text or CSS selectors.
- Be sure to check out the different ways Playwright is able to wait, such as waitForLoadState, waitForFunction, and waitForURL.
- Build and test step-by-step: Develop your automation one page or step at a time. For example, first get the login working. Once that’s solid, add the next navigation and data scraping. This incremental approach makes it easier to isolate issues. As one of our developers put it: “Go page by page in your flow, making sure each step works before moving to the next.” If possible, modularize your code by breaking it into functions for each section (login, navigate, scrape, etc.), which you can test individually.
- Use Chrome DevTools to inspect the live DOM: To reliably select elements, don’t rely on “View Page Source” – that often shows the initial HTML but not the dynamic state after scripts run. Instead, open the site in a real browser and use Inspect Element to examine the DOM and find unique attributes or accessible roles for your targets. Copy outer HTML or relevant snippets of the DOM. This is the HTML you want to base your selectors on.
Why not View Source? Many modern sites (including Reddit) load content dynamically or modify the DOM after initial load. The static source may omit crucial elements (especially if content is behind an authenticated session or requires scrolling to appear). Always inspect the live DOM when possible.
- Leverage Playwright Codegen for quick selector hints: Playwright has a great tool to help generate selectors. Run npx playwright codegen <url> to open the site in a special mode where you can interact with it. As you click around, it will suggest Playwright code (in JavaScript) for those actions, including the selectors used. This can be invaluable for tricky elements. You can copy the selectors (or even entire actions) and adapt them into your Pluto script. Codegen uses Playwright’s best practices (like getByRole when applicable), giving you a head start.
💡 Tip: Using Playwright Codegen
When you run npx playwright codegen https://www.example.com, a browser window and a code window will appear. Perform the steps you’re trying to automate (e.g., fill login form, click buttons). The code window will show lines like:
await page.getByRole("textbox", { name: "Email" }).fill("test@example.com");
await page.getByRole("textbox", { name: "Password" }).fill("mypassword");
await page.getByRole("button", { name: "Log In" }).click();
You can even change the language (Playwright supports Python, C#, etc., but we use JS). While you still need to integrate these into your Pluto script structure (with session.prompt, session.prove, etc.), codegen is excellent for discovering how to target elements or what Playwright’s recommended selector is for a given element. It can save time and ensure you use resilient selectors.
- Logging is your friend: Use console.log() generously to trace the execution of your script. In the examples above, you see logs for each major step (e.g., “Filling password”, “Detected 2-Step Verification”, “User Profile Loaded”). When running the script via Pluto Frame or even locally, these logs will help you understand where it might be failing or stuck. If a script stops unexpectedly, check the last log printed – it often indicates the step that didn’t complete.
- Use waits and timeouts appropriately: A common mistake is assuming something will happen instantly. Always wait for navigations (waitForURL, or at least waitForLoadState; page.waitForNavigation also works but is deprecated in current Playwright because it is prone to races), and wait for elements to appear before interacting. If an element might take time to load (due to network or animations), use waitForSelector with a reasonable timeout or locator.waitFor(). If you’re not sure what to wait for, a short waitForTimeout as a crude delay can sometimes help, but prefer deterministic waits when possible (like waiting for a specific element or text that indicates readiness). Some actions also wait on their own (e.g., click() will wait for navigation by default in some cases), so take that into account too.
- Handling element not found errors: If your script throws an error like “No element found for selector…” or times out waiting for an element:
  - Double-check the selector or locator. It might be incorrect or misspelled. Use DevTools inspector or codegen to verify the element’s attributes.
  - Consider the context: Is the element inside an iframe or shadow DOM? (Playwright locators such as getByRole pierce open shadow DOM by default, but they do not reach into iframes, whether same-origin or cross-origin; use page.frameLocator() to target elements inside a frame.)
  - Ensure that the step that loads that element was successful. For instance, if login failed, the profile page won’t ever load the element you expect. So the real issue might be earlier.
- Dealing with different site layouts or modes: Some websites have very different DOM structure depending on user settings or user agent:
  - Reddit for example has an “Old Reddit” design (if a user opts for the old interface, or if you navigate to old.reddit.com). Our script assumes the new design. If you needed to support old Reddit, you’d have to use completely different selectors (the old site is not React-based and has different IDs/classes). One approach is to detect which version is loaded (e.g., if (await page.locator('body[class*="old-Reddit"]').count() > 0) { /* use old flow */ } else { /* new flow */ }). Often, it’s easiest to ensure you use the version you expect by using specific URLs or settings.
- Anticipating and handling 2FA/OTP flows: As seen, always consider that a user may have two-factor auth enabled. If you write a script for a site with optional 2FA, include checks for those elements and prompt the user for the code. It’s fine if the user doesn’t have 2FA; the check will just skip. But if you don’t include it and a user does have 2FA, your script will fail for them. Our Reddit and Google examples both show patterns for handling OTP. Generally:
  - Check for an OTP input or prompt after login attempt.
  - If present, use session.prompt to ask the user for the code (with an appropriate message so they know to get it from their device).
  - Possibly handle multiple methods if the site provides them (as we did for Google).
  - Testing tip: It can be tricky to test OTP flows because once you successfully verify a device, some sites won’t ask again for a while. You can often force it by using a fresh session/storage (Pluto’s sessions are isolated, but if you reuse the same session cache it might have a cookie marking the device as trusted). Use session.clearCache() or start a new session to simulate a first-time login for OTP testing.
- Cloudflare or CAPTCHA blocks: A few sites might present anti-bot challenges (like Cloudflare’s IUAM (“I’m Under Attack Mode”) page or reCAPTCHA) when they detect automation. This is a tough scenario. If your script consistently hits a CAPTCHA:
  - Recognize that currently full automation may not be possible without solving the CAPTCHA. You might detect it (e.g., a URL containing /challenge or presence of CAPTCHA iframe) and then decide to fail gracefully or notify the user.
  - We are working on integrating CAPTCHA solving in the platform, but until that’s available, you may have to avoid or manually intervene on such sites. Slowing down the navigation or using known headers might reduce detection, but there’s no guarantee.
  - If it’s a Cloudflare browser challenge, sometimes simply waiting (as our code does in some examples) might get past it after a few seconds, or a retry might succeed. But be cautious: hitting these means the site is hostile to automation.
  - Security note: Avoid any solutions that require exposing user credentials to third-party CAPTCHA solver services without user consent.
- Security best practices:
  - Never log sensitive information. Our examples log actions but never the actual passwords or codes. Be mindful if you catch errors or print variables.
  - Use type: 'password' in prompts for secrets so that they are masked in the UI.
  - Clean up any credentials from memory if possible (though in a short-lived script this is less of an issue).
  - Understand that while the automation runs in a secure enclave, any data you console.log or session.prove will leave that enclave in some form (attested output or logs). Only output what is necessary for your application logic.
  - If dealing with financial or personal data, make sure to handle it according to user consent and regulatory requirements. Pluto provides attestation to prove the data origin, but you as the developer should still treat the data carefully.
- Testing and iteration: It’s a good idea to test your script with multiple accounts if possible (e.g., one with 2FA, one without, one with different settings like dark mode or old Reddit, etc.). This helps uncover edge cases. Keep in mind that sites update their UIs periodically – what works today might break if the site changes a button text or layout. Maintenance is part of writing automation scripts; having good logging and a clear structure will make it easier to update scripts when needed.
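The tips on deterministic waits, 2FA detection, and CAPTCHA detection can be combined: instead of a fixed waitForTimeout after submitting a login form, poll until one of several known outcomes shows up. The following is a minimal sketch under stated assumptions. waitForOutcome is a hypothetical helper (not a Playwright or Pluto function), and the selectors in the usage comment, in particular the reCAPTCHA iframe selector, are assumptions to verify against the live DOM. The usage relies only on locator.count(), which resolves to the number of matching elements.

```javascript
// Hypothetical helper: poll a set of named checks until one reports true.
// Each check is { name, test }, where test() resolves to a boolean.
// Checks run in array order, so earlier entries win when several match.
async function waitForOutcome(checks, { timeoutMs = 15000, intervalMs = 250 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    for (const { name, test } of checks) {
      if (await test()) return name;
    }
    if (Date.now() >= deadline) {
      const names = checks.map((c) => c.name).join(", ");
      throw new Error(`None of [${names}] appeared within ${timeoutMs} ms`);
    }
    // Short pause between polling rounds
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage after clicking "Log In" (page is a Playwright Page; selectors are assumptions):
// const outcome = await waitForOutcome([
//   { name: "otp", test: async () =>
//       (await page.getByRole("textbox", { name: "Verification code" }).count()) > 0 },
//   { name: "captcha", test: async () =>
//       (await page.locator('iframe[src*="recaptcha"]').count()) > 0 },
//   { name: "loggedIn", test: async () =>
//       (await page.getByRole("button", { name: /expand user menu$/i }).count()) > 0 },
// ]);
// if (outcome === "captcha") throw new Error("CAPTCHA detected, stopping gracefully");
```

Returning the name of the matched outcome keeps the branching logic (prompt for a code, fail gracefully, or continue) in one place and makes the last log line more informative when something goes wrong.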
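The advice to never log sensitive information can also be enforced mechanically rather than by convention. The sketch below masks values whose key looks sensitive before anything reaches console.log. Both redactForLog and the key pattern are hypothetical and should be adapted to the field names your script actually uses; the pattern deliberately errs on the side of masking too much.

```javascript
// Assumed pattern for keys that should never appear in logs; extend as needed.
const SENSITIVE_KEY = /pass(word)?|otp|code|token|secret/i;

// Hypothetical helper: return a copy of `value` that is safe to log,
// with the values of sensitive-looking keys replaced by a placeholder.
function redactForLog(value) {
  if (Array.isArray(value)) return value.map(redactForLog);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) => [
        key,
        SENSITIVE_KEY.test(key) ? "[redacted]" : redactForLog(inner),
      ])
    );
  }
  return value; // primitives pass through unchanged
}

// Usage:
// console.log("Login attempt", JSON.stringify(redactForLog({ username, password })));
```

The input object is never modified, so the script can keep using the real values while only the masked copy is printed.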
By following these practices, you’ll be able to create Pluto automation scripts that are robust and handle the many quirks of web automation. Happy scripting!
Summary
We covered how to write and debug Pluto automation scripts using Playwright, with a deep dive into two examples. The pseudo-bank example showed a straightforward login and data retrieval. The Reddit example demonstrated handling multiple login methods, 2FA, conditional flows, and data scraping. Along the way, we discussed strategies for selecting DOM elements reliably (using roles, labels, test ids), dealing with dynamic content, and anticipating security measures. We also provided tips for debugging, such as using npx playwright codegen, console logging, and testing stepwise. By applying these examples and tips, developers can confidently build custom automations for a variety of sites and scenarios.