\u003Cfigcaption>\u003Cstrong>\u003Cem>You can’t tell me “Megatron-Turing” doesn’t sound really frickin’ cool\u003C/em>\u003C/strong>\u003C/figcaption>\u003C/figure>\n\u003C!-- /wp:image -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>You can think of a parameter as comparable to a single synapse within a human brain. \u003Ca href=\"https://youtu.be/eAn_oiZwUXA?t=2929\">Nvidia estimates\u003C/a> that by 2023 it will have developed a model that matches the average human brain parameter-for-synapse, at 100 trillion parameters. To support these massive models, Nvidia just announced its \u003Ca href=\"https://blogs.nvidia.com/blog/2022/03/22/h100-transformer-engine/\">Hopper architecture\u003C/a>, which can train them up to six times faster. \u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>While model size isn’t the only factor in measuring the intelligence of an NLP model (see the controversy surrounding several existing \u003Ca href=\"https://analyticsindiamag.com/the-controversy-behind-microsoft-nvidias-megatron-turing-scale/#:~:text=What%20that%20means%20is%2C%20Megatron,NVIDIA%20offered%20several%20results%2Foutcomes.\">trillion-plus parameter models\u003C/a>), it’s undoubtedly important. The more parameters an NLP model has, the greater the odds it will be able to decipher and interpret user queries—particularly when they are complicated or include more than one intent.\u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:heading {\"level\":3} -->\n\u003Ch3 id=\"h-tooling\">Tooling\u003C/h3>\n\u003C!-- /wp:heading -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>The evolution of frameworks and libraries such as \u003Ca href=\"https://pytorch.org/\">PyTorch\u003C/a>, \u003Ca href=\"https://www.tensorflow.org/\">TensorFlow\u003C/a>, and others makes it faster and easier to build powerful learning models. 
\u003Ca href=\"https://pytorch.org/blog/pytorch-1.11-released/\">Recent\u003C/a> \u003Ca href=\"https://github.com/tensorflow/tensorflow/releases\">versions\u003C/a> have made it simpler to create complex models and run deterministic model training. \u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>These toolsets were initially developed by world leaders in AI/ML—Pytorch was created by Facebook’s AI Research Lab (FAIR) and TensorFlow by the Google Brain team—and have subsequently been made open-source. These projects are actively maintained and provide proven resources that can save years of development time, allowing teams to build sophisticated chatbots without needing advanced AI, ML, and NLP skills.\u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>Since then, new tools have further accelerated the power of NLP models. For those wanting the power of these tools without the burden of configuring them, MLOps platforms like \u003Ca href=\"https://wandb.ai/site\">Weights & Biases\u003C/a> provide a full service platform for model optimization, training, and experiment tracking. As the ML field becomes more sophisticated, more powerful tooling will come along. \u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:heading {\"level\":3} -->\n\u003Ch3 id=\"h-parallel-computing-hardware\">Parallel computing hardware\u003C/h3>\n\u003C!-- /wp:heading -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>Whereas a CPU provides general purpose processing for any given function, GPUs evolved to process a large number of simple mathematical transformations in parallel. This massively parallel computation capability make it \u003Ca href=\"https://www.weka.io/blog/gpus-for-machine-learning/\">ideal for NLP\u003C/a>. 
Specialized processors such as \u003Ca href=\"https://en.wikipedia.org/wiki/Tensor_Processing_Unit\">TPUs\u003C/a> and \u003Ca href=\"https://en.wikipedia.org/wiki/AI_accelerator\">NPUs/AI accelerators\u003C/a> take these capabilities further, offering hardware purpose-built for ML and AI applications. \u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>As hardware grows in power, it becomes faster and cheaper to build and operate large NLP models. For those of us who aren’t shelling out the money for these powerful chipsets, many cloud providers offer compute time on their own specialized servers. \u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:heading {\"level\":3} -->\n\u003Ch3 id=\"h-datasets\">Datasets\u003C/h3>\n\u003C!-- /wp:heading -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>NLP datasets have grown exponentially, partly due to the open-sourcing of commercially built and trained datasets by companies like Microsoft, Google, and Facebook. These datasets are a huge asset when building NLP models, as they contain enormous volumes of real user queries. New communities like Hugging Face have arisen to share effective models widely. \u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>To see the effect of these datasets, look no further than \u003Ca href=\"https://towardsdatascience.com/the-quick-guide-to-squad-cae08047ebee\">SQuAD\u003C/a>, the Stanford Question Answering Dataset. When SQuAD was first released in 2016, it seemed an impossible task to build an NLP model that could score well against it. Today, this task is considered easy, and \u003Ca href=\"https://www.researchgate.net/publication/318315707_Long-Term_Memory_Networks_for_Question_Answering\">many models achieve very high accuracy\u003C/a>. \u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>As a result, new test datasets challenge NLP model creators. 
There’s \u003Ca href=\"https://rajpurkar.github.io/SQuAD-explorer/\">SQuAD 2.0\u003C/a>, which was meant to be a more difficult version of the original, but even that is becoming easy for current models. New datasets like \u003Ca href=\"https://gluebenchmark.com/\">GLUE and SuperGLUE\u003C/a> now offer multi-sentence challenges to give cutting edge NLP models a challenge. \u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:heading -->\n\u003Ch2 id=\"h-should-you-build-or-buy\">Should you build or buy?\u003C/h2>\n\u003C!-- /wp:heading -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>In hearing about all these advances in AI, ML, NLP, and related technologies, you may think it’s time to chuck out your chatbot and build a new one. You’re probably right. But there are fundamentally two solutions for development teams:\u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:list {\"ordered\":true} -->\n\u003Col>\u003Cli>Build a chatbot from the ground up to incorporate today’s superior technologies.\u003C/li>\u003Cli>Purchase a toolset that abstracts the difficult NLP side of things—ideally with some additional features—and build from there.\u003C/li>\u003C/ol>\n\u003C!-- /wp:list -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>This is the classic “build or buy” dilemma, but in this case, the answer is simpler than you might think. \u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>For a smaller development team with limited resources, building a chatbot from scratch to incorporate the latest AI, ML, and NLP concepts requires great talent and a lot of work. Skills in these areas are hard (and expensive) to come by, and most developers would prefer not to spend years acquiring them.\u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>What about development teams at larger organizations with resources to hire data scientists and AI/ML/NLP specialists? 
I believe building from scratch still isn’t likely to be worthwhile.\u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>Imagine a big bank with a dedicated team working on its latest chatbot, including five data scientists working on a custom NLP pipeline. The project takes perhaps 18 months to produce a usable chatbot—but by that time, advances in open-source tooling and resources have already caught up with anything new the team has built. As a result, there’s no discernible ROI from the project compared to working with a commercially available toolset.\u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>Worse, because the chatbot relies on a custom NLP pipeline, there’s no simple way to incorporate further advances in NLP or related technologies. Doing so will require considerable effort, further reducing the project’s ROI.\u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>I confess I am biased, but I honestly believe that building, maintaining, and updating NLP models is simply too difficult, too resource-intensive, and too slow to be worthwhile for most teams. It would be like building your own cloud infrastructure as a startup, rather than piggybacking on a big provider with cutting-edge tooling and near-infinite scale.\u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>What’s the alternative?\u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>A toolset like Botpress can abstract the NLP side of things and provide an IDE for developers to build chatbots without hiring or learning new skills—or building the tooling they need from scratch. 
This can provide a series of benefits for chatbot projects:\u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:list -->\n\u003Cul>\u003Cli>Significantly reduced development time.\u003C/li>\u003Cli>Easy upgrades to the latest NLP technologies without significant reworking.\u003C/li>\u003Cli>Less effort to maintain chatbots as updates are automatic.\u003C/li>\u003C/ul>\n\u003C!-- /wp:list -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>Best of all, developers can focus on building and improving the experience and functionality of their own software—not learning AI/ML/NLP.\u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:heading -->\n\u003Ch2 id=\"h-start-building-chatbots-today\">Start building chatbots today\u003C/h2>\n\u003C!-- /wp:heading -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>If I’ve piqued your interest in building chatbots, you can start right now. At Botpress, we provide an open-source developer platform you can download and run locally in under a minute.\u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>To get started, visit our \u003Ca href=\"https://botpress.com/for-developers\">chatbot developer page\u003C/a>. For a walkthrough on how to install the platform and build your first chatbot, refer to our \u003Ca href=\"https://botpress.com/blog/getting-started-with-botpress\">getting started with Botpress guide\u003C/a>.\u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>You can also test out the live demo of our latest product—a radically new method of creating knowledge-based, “intentless” chatbots, called \u003Ca href=\"http://openbook.botpress.com\">OpenBook\u003C/a>, announced this week. 
\u003C/p>\n\u003C!-- /wp:paragraph -->\n\n\u003C!-- wp:separator -->\n\u003Chr class=\"wp-block-separator has-alpha-channel-opacity\"/>\n\u003C!-- /wp:separator -->\n\n\u003C!-- wp:paragraph -->\n\u003Cp>\u003Cem>The Stack Overflow blog is committed to publishing interesting articles by developers, for developers. From time to time that means working with companies that are also clients of Stack Overflow’s through our advertising, talent, or teams business. When we publish work from clients, we’ll identify it as Partner Content with tags and by including this disclaimer at the bottom.\u003C/em>\u003C/p>\n\u003C!-- /wp:paragraph -->