Technology
Born out of San Francisco AI hackathons, Agency lets you see what your AI agents are doing

After a long week of coding, you might think that San Francisco builders would retreat to the mountains, beaches, or the Bay Area’s vibrant club scene. But in fact, as the week winds down, the AI hackathons begin.
Over the past few years, San Francisco has exploded with AI hackathons. Every Saturday or Sunday, technologists give talks on the latest advances in AI, network, and, most importantly, turn ideas into working demos. Sometimes hackathons offer money or cloud credits as prizes, but the real winners walk away with the makings of a startup.
“There’s no better place in the world to build the most ambitious project of your life than San Francisco,” says Agency co-founder Alex Reibman. “You often see a lot of competitions, like hackathons, but they’re not cutthroat. They’re as collaborative as they are competitive.”
At a hackathon in San Francisco last summer, Reibman decided to try his hand at building AI agents that would crawl the web. Agents are a hot topic in Silicon Valley as the AI boom reaches its peak. The term isn’t precisely defined, but it broadly describes AI bots that can perform tasks automatically using interfaces and services that weren’t originally designed for automation, a kind of substitute for mundane tasks that once required human intervention.
But Reibman immediately ran into a problem. “They sucked,” Reibman said in an interview. “The agents failed 30 to 40 percent of the time, and often in unexpected ways.”
To fix this, Reibman’s team built internal debugging tools to see where their agents were going wrong. They eventually managed to get the agents to work a little better, but the debugging tools themselves ultimately stole the show and won the hackathon.
“I started showing the tools at a lot of hackathons and events in San Francisco, and people started asking for access to them,” Reibman said. “That was basically the confirmation I needed: instead of building an agent ourselves, we should build tools that make it easier to build agents.”
So Reibman founded Agency with co-founders Adam Silverman and Shawn Qiu, offering tools to see what AI agents are actually doing and catch where they’re going wrong. A year later, those tools have become Agency’s core product, the AgentOps platform, which thousands of teams use every month, Reibman tells TechCrunch. The startup has already raised $2.6 million in pre-seed funding, led by 645 Ventures and Afore Capital.
COO Adam Silverman tells TechCrunch that AgentOps is like “multiple device management for agents,” analyzing all agent actions to make sure they don’t go down a rogue path.
“You want to understand whether your agent is going to act dishonestly and determine what limitations you can put in place,” Silverman said in an interview. “A lot of the work is being able to visually see where your guardrails are and whether agents are abiding by them before you put them into production.”
The startup is partnering with Cohere and Mistral, AI model developers that also offer agent-building services, so customers can use the AgentOps dashboard to see how agents interact with the world and how much each one costs. Agency is model-agnostic, meaning it works with many different AI agent frameworks, but it integrates with popular tools like Microsoft’s AutoGen, CrewAI, and AutoGPT.
In addition to the AgentOps dashboard, Agency also offers consulting services (Reibman previously worked at consulting firm EY) to help companies start building agents. Agency wouldn’t disclose any clients by name, but said hedge funds, consultants, and marketing firms use its tools.
For example, Reibman says Agency helped create an AI agent that writes blog posts about the companies a client does business with. Now, that same client uses the AgentOps dashboard to track agent performance and costs.
Big players like OpenAI and Google are likely to ramp up their agent products in the coming months, and AI startups like Agency need to find a way to work with these advances, not against them.
“There are so many layers in the stack that it’s unlikely that an LLM vendor would try to cover all of them,” Reibman said. “OpenAI and Anthropic are building tools to create agents, but there are a lot of layers around them that make sure you have a production-ready code base.”
Technology
Meta’s benchmarks for its new AI models are somewhat misleading

One of the new flagship AI models Meta released on Saturday, Maverick, ranks second on LM Arena, a test in which human raters compare the outputs of models and choose which they prefer. But it appears that the version of Maverick that Meta deployed to LM Arena differs from the version that’s widely available to developers.
As several AI researchers pointed out on X, Meta noted in its announcement that the Maverick on LM Arena is an “experimental chat version.” A chart on the official Llama website, meanwhile, reveals that Meta’s LM Arena testing was carried out using “Llama 4 Maverick optimized for conversation.”
As we wrote earlier, for various reasons LM Arena has never been the most reliable measure of an AI model’s performance. But AI companies generally haven’t customized or otherwise fine-tuned their models to score better on LM Arena, or at least haven’t admitted to doing so.
The problem with tailoring a model to a benchmark, withholding it, and then releasing a “vanilla” variant of the same model is that it makes it hard for developers to predict how well the model will perform in specific contexts. It is also misleading. Ideally, benchmarks, woefully inadequate as they are, provide a snapshot of a single model’s strengths and weaknesses across a range of tasks.
Indeed, researchers on X have observed stark differences in the behavior of the publicly downloadable Maverick compared with the model hosted on LM Arena. The LM Arena version seems to use a lot of emojis and gives extremely long answers.
Okay, Llama 4 is def a little cooked lol, what is this yap city pic.twitter.com/y3gvhbvz65
– Nathan Lambert (@natolambert) April 6, 2025
For some reason, the Llama 4 model in Arena uses a lot more emojis.
On together.ai, it seems better: pic.twitter.com/f74odx4zttt
– Tech Dev Notes (@techdevnotes) April 6, 2025
We have reached out to Meta and to Chatbot Arena, the organization that maintains LM Arena, for comment.
Technology
Trump delays the TikTok ban

Donald Trump has signed a new “Save TikTok” executive order.
TikTok will live to see another day, at least for now. On April 4, President Donald Trump signed a new executive order delaying the ban on the popular social app by another 75 days. The app was set to go dark in the United States on April 5.
The app, owned by the Chinese company ByteDance, is now on its second extension this year. In 2024, President Biden signed bipartisan legislation to ban TikTok, citing national security concerns. Congress passed it by an overwhelming margin. Although Trump has signed the executive order to “save” the app, many have questioned the legality of the move. As with many of the president’s actions at the start of his term, critics complain that he appears to be exceeding the authority of the executive office.
Trump announced his move to halt the ban on Truth Social, saying that his administration is still working on a deal.
“My administration has been working very hard on a deal to save TikTok, and we have made great progress,” Trump wrote on April 4. “The deal requires more work to ensure all necessary approvals are signed, which is why I am signing an executive order to keep TikTok running for an additional 75 days.”
Trump cited his newly imposed tariffs on China as a key reason for the stalled negotiations over a buyer.
“We hope to continue working in good faith with China, who I understand are not very happy about our reciprocal tariffs, necessary for fair and balanced trade between China and the USA,” wrote Trump. “This proves that tariffs are the most powerful economic tool and very important to our national security. We do not want TikTok to go dark. We look forward to working with TikTok and China to close the deal.”
This marks the second time Trump has stepped in to delay the ban. On January 20, shortly after returning to office, he signed the first extension to keep TikTok, used by over 170 million Americans, available to users.
The potential sale of TikTok has drawn significant attention from major players in the business world. According to The Hill, many private equity firms, venture capital groups, and top technology investors have submitted offers for the popular app.
Among the bidders reportedly in the mix are Blackstone, Oracle, Amazon (founded by Jeff Bezos), and OnlyFans founder Tim Stokely. Interest in buying TikTok has increased as uncertainty about its future in the US continues to grow.
The app, used by 170 million Americans, sits at the center of ongoing political and economic negotiations between the United States and China. With pressure mounting and deadlines approaching, the possibility of a sale has opened the door to some of the biggest names in technology and finance.
Technology
DOGE is reportedly planning a hackathon to build a “mega API” for IRS data

Elon Musk’s Department of Government Efficiency (DOGE) is planning to hold a hackathon next week focused on creating a “mega API” that would provide access to taxpayer data, according to Wired.
Wired reports that the hackathon is being organized by two DOGE staffers at the Internal Revenue Service, Gavin Kliger and Sam Corcos, who is also the CEO of the healthtech startup Levels. Corcos reportedly told others at DOGE that his goal is to build “one new API to rule them all.”
This would make it easier for cloud providers to access IRS data, including taxpayers’ names, addresses, Social Security numbers, tax returns, and employment information, which could then be exported to external systems. According to Wired, a third-party vendor would manage parts of the project, and Palantir has “consistently” come up as a candidate.
“Basically, it is an open door controlled by Musk to the most sensitive information of all Americans, without any of the rules that normally secure this data,” an anonymous IRS employee said.