
Watson AI & watsonx Guide

Build enterprise AI with watsonx.ai, Granite models, Watson Assistant, Watson Discovery, RAG architectures, and AI governance.

CloudToolStack Team · 26 min read · Published Mar 14, 2026

Prerequisites

  • Basic understanding of AI and machine learning concepts
  • IBM Cloud account with Watson service permissions

Watson AI and watsonx on IBM Cloud

IBM's AI portfolio has evolved significantly with the launch of watsonx, a next-generation AI and data platform built for enterprise. The watsonx platform consists of three integrated components: watsonx.ai (foundation model studio), watsonx.data (fit-for-purpose data store), and watsonx.governance (AI governance toolkit). Together with the established Watson services (Watson Assistant, Watson Discovery, Watson Speech), IBM offers one of the most comprehensive enterprise AI platforms available.

What distinguishes IBM's AI approach is the focus on enterprise governance, transparency, and trusted AI. IBM's Granite foundation models are trained on curated enterprise data with full provenance tracking, and IBM provides an IP indemnity shield for Granite models. The watsonx.governance platform enables organizations to monitor AI models for bias, drift, and quality, meeting regulatory requirements like the EU AI Act.

This guide covers the watsonx platform, Granite and third-party foundation models, Watson Assistant for conversational AI, Watson Discovery for enterprise search, prompt engineering, model tuning, RAG architectures, and AI governance.

watsonx.ai: Foundation Model Studio

watsonx.ai is the core AI platform for working with foundation models. It provides a prompt lab for experimenting with models, APIs for integrating AI into applications, and tools for tuning models on your own data. watsonx.ai supports both IBM Granite models and third-party models from Meta (Llama), Mistral, and others.

Granite Models

IBM Granite models are enterprise-optimized foundation models available in multiple sizes and specializations:

  • Granite 13B Chat: General-purpose conversational model for chatbots, question answering, and content generation.
  • Granite 13B Instruct: Instruction-following model for task completion, summarization, and extraction.
  • Granite 8B Code: Code generation and explanation model supporting 100+ programming languages.
  • Granite 3B: Lightweight model for low-latency applications and edge deployment.
  • Granite Embedding: Text embedding model for semantic search and RAG applications.

Granite IP Indemnity

IBM provides IP indemnity protection for Granite models, meaning IBM will defend customers against third-party IP infringement claims related to the output of Granite models when used as part of IBM's watsonx platform. This is a significant differentiator for enterprises concerned about legal risks from generative AI.

Using the watsonx.ai API

python
from ibm_watsonx_ai import APIClient
from ibm_watsonx_ai.foundation_models import ModelInference

# Initialize the client
credentials = {
    "url": "https://us-south.ml.cloud.ibm.com",
    "apikey": "<your-api-key>"
}
client = APIClient(credentials)

# Set the project or space
project_id = "<your-project-id>"

# Create a model inference instance
model = ModelInference(
    model_id="ibm/granite-13b-chat-v2",
    api_client=client,
    project_id=project_id,
    params={
        "max_new_tokens": 500,
        "temperature": 0.7,
        "top_p": 0.9,
        "repetition_penalty": 1.1
    }
)

# Generate text
response = model.generate_text(
    prompt="Explain the benefits of hybrid cloud architecture "
           "for financial services companies."
)
print(response)

Prompt Engineering

Effective prompt engineering is critical for getting high-quality results from foundation models. IBM recommends a structured approach to prompts:

  • System instruction: Define the AI's role and behavior constraints.
  • Context: Provide relevant background information.
  • Examples: Include few-shot examples of desired input-output pairs.
  • Task: Clearly state what you want the model to do.
  • Output format: Specify the expected format (JSON, markdown, bullet points).
An example structured prompt following this pattern:

text
System: You are an IBM Cloud solutions architect specializing in
financial services compliance. Respond with specific IBM Cloud
services and configurations.

Context: A regional bank needs to migrate their core banking
application to the cloud while maintaining FFIEC compliance.

Task: Recommend an IBM Cloud architecture that addresses:
1. Data residency requirements
2. Encryption at rest and in transit
3. Audit logging and monitoring
4. High availability and disaster recovery

Output Format: Provide a numbered list of recommendations with
the specific IBM Cloud service name and configuration for each.
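
The structure above can also be assembled programmatically, which makes prompts easier to version and test. A minimal sketch (the helper function and its field names are our own, not a watsonx.ai API):

```python
def build_prompt(system, context, task, output_format, examples=()):
    """Assemble a structured prompt from the recommended components:
    system instruction, context, few-shot examples, task, output format."""
    parts = [f"System: {system}", f"Context: {context}"]
    for example_in, example_out in examples:
        parts.append(f"Example input: {example_in}\nExample output: {example_out}")
    parts.append(f"Task: {task}")
    parts.append(f"Output Format: {output_format}")
    return "\n\n".join(parts)

prompt = build_prompt(
    system="You are an IBM Cloud solutions architect.",
    context="A regional bank is migrating core banking to the cloud.",
    task="Recommend an IBM Cloud architecture for FFIEC compliance.",
    output_format="Numbered list with specific IBM Cloud service names.",
)
print(prompt.splitlines()[0])  # "System: You are an IBM Cloud solutions architect."
```

Keeping prompt assembly in one tested function makes it straightforward to A/B different system instructions or few-shot examples in the prompt lab.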

Model Tuning

watsonx.ai supports prompt tuning and fine-tuning to customize foundation models for your specific domain and use cases. Prompt tuning adds a small number of tunable parameters (soft prompts) that are prepended to input prompts, adapting the model's behavior without modifying the base model weights.
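
Conceptually, soft prompts work like this (a toy sketch in plain Python, not the watsonx.ai implementation):

```python
import random

# Toy illustration of prompt tuning: a small set of trainable "soft
# prompt" vectors is prepended to the frozen model's input embeddings.
# Training updates only the soft prompt; the base model's weights
# (represented here by the embedding table) stay fixed.

random.seed(0)
VOCAB_SIZE, EMBED_DIM, NUM_SOFT_TOKENS = 100, 8, 4

def random_vector():
    return [random.gauss(0, 1) for _ in range(EMBED_DIM)]

embedding_table = [random_vector() for _ in range(VOCAB_SIZE)]   # frozen
soft_prompt = [random_vector() for _ in range(NUM_SOFT_TOKENS)]  # trainable

def embed_with_soft_prompt(token_ids):
    """Prepend the soft-prompt vectors to the token embeddings."""
    return soft_prompt + [embedding_table[t] for t in token_ids]

inputs = embed_with_soft_prompt([5, 17, 42])
print(len(inputs), len(inputs[0]))  # 7 rows (4 soft + 3 tokens), 8 dims each
```

Because only the small soft-prompt matrix is trained, prompt tuning is far cheaper than full fine-tuning and the same base model can serve many tuned variants.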

python
from ibm_watsonx_ai.foundation_models.utils import TuneExperimentConfig

# Define a prompt tuning experiment
config = TuneExperimentConfig(
    name="banking-qa-tuner",
    base_model="ibm/granite-13b-instruct-v2",
    task_type="classification",
    training_data_reference={
        "type": "connection_asset",
        "location": {"path": "banking-qa-training.jsonl"}
    },
    parameters={
        "num_epochs": 20,
        "learning_rate": 0.01,
        "batch_size": 16,
        "accumulate_steps": 1
    }
)

# Start the tuning experiment
experiment = client.foundation_models.TuneExperiment(config)
experiment.run()
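
Prompt tuning expects training data as JSONL, one example per line. A sketch of preparing such a file (the input/output field names follow the commonly documented watsonx.ai format; verify the exact schema against the current docs):

```python
import json

# Write labeled examples to a JSONL file for a prompt tuning experiment.
# Each line is one {"input": ..., "output": ...} example.
examples = [
    {"input": "What is the wire transfer cutoff time?",
     "output": "account_services"},
    {"input": "How do I dispute a credit card charge?",
     "output": "disputes"},
]

with open("banking-qa-training.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

with open("banking-qa-training.jsonl") as f:
    print(sum(1 for _ in f))  # 2
```

The resulting file is what the tuning experiment's training_data_reference points at, typically after uploading it to your project as a data asset.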

RAG (Retrieval-Augmented Generation)

RAG is the recommended pattern for building AI applications that need to answer questions about your organization's private data. Instead of fine-tuning a model on your data, RAG retrieves relevant documents at query time and includes them in the prompt context. This approach provides up-to-date answers, reduces hallucinations, and maintains data security because your data is never used for model training.

A typical IBM Cloud RAG architecture uses:

  • Watson Discovery: Enterprise search and document understanding for retrieving relevant passages.
  • Elasticsearch on IBM Cloud: Vector search for semantic similarity matching.
  • Granite Embedding: Convert documents and queries into vector embeddings.
  • watsonx.ai: Generate answers using retrieved context and a Granite or Llama model.
  • Cloud Object Storage: Store source documents and knowledge base assets.
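
To make the retrieval step concrete, here is a self-contained toy sketch. In production, embed() would call Granite Embedding and the index would live in Watson Discovery or Elasticsearch; the character-frequency "embeddings" below are a stand-in so the example runs anywhere:

```python
import math

def embed(text):
    # Toy embedding: 26-dim letter-frequency vector (stand-in for a
    # real embedding model such as Granite Embedding).
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "Key Protect manages encryption keys on IBM Cloud.",
    "Watson Discovery ingests PDFs and ranks passages.",
]
# Index: embed every document once at ingestion time.
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query, k=1):
    """Return the k documents most similar to the query."""
    qv = embed(query)
    ranked = sorted(index, key=lambda pair: -cosine(qv, pair[1]))
    return [doc for doc, _ in ranked[:k]]

# RAG step: retrieved passages go into the generation prompt as context.
context = retrieve("How does Discovery handle PDF documents?")[0]
prompt = f"Context: {context}\n\nQuestion: How does Discovery handle PDFs?"
```

The prompt string would then be sent to a Granite or Llama model on watsonx.ai via ModelInference, as shown earlier.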

Watson Assistant

Watson Assistant is IBM's enterprise conversational AI platform for building chatbots and virtual agents. Watson Assistant combines traditional dialog management (intent recognition, entity extraction, dialog flows) with generative AI capabilities powered by watsonx.ai foundation models.

Key capabilities include:

  • Actions: No-code conversational flows with automatic intent recognition.
  • Conversational Search: Integration with Watson Discovery for answering questions from your knowledge base.
  • Custom Extensions: Connect to external APIs and back-end systems.
  • Multi-channel: Deploy on web, phone, SMS, Slack, Microsoft Teams, and WhatsApp.
  • Analytics: Conversation analytics, containment rates, and customer satisfaction tracking.

Watson Discovery

Watson Discovery is an AI-powered enterprise search and content intelligence platform. It ingests documents in various formats (PDF, Word, HTML, JSON), applies natural language processing to extract entities, relationships, and sentiments, and provides relevance-ranked search results. Discovery is commonly used as the retrieval component in RAG architectures and as the knowledge base for Watson Assistant conversational search.

bash
# Create a Watson Discovery instance
ibmcloud resource service-instance-create my-discovery \
  discovery plus us-south

watsonx.governance

watsonx.governance provides the tooling to manage the AI lifecycle with transparency, accountability, and compliance. It addresses the growing regulatory requirements around AI, including the EU AI Act, by providing:

  • Model Inventory: Central registry of all AI models with metadata, lineage, and deployment status.
  • Bias Detection: Automated monitoring for fairness across protected attributes (age, gender, race).
  • Drift Detection: Monitor model performance degradation over time and alert when retraining is needed.
  • Explainability: Generate explanations for individual predictions to support regulatory compliance.
  • Risk Management: AI risk assessment workflows with approval gates and compliance checklists.
  • Factsheet: Automated documentation of model training data, metrics, and deployment history.
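
watsonx.governance computes fairness metrics like these automatically. As a plain-Python illustration of one widely used bias metric, the disparate impact ratio (this is not the watsonx API):

```python
def disparate_impact(outcomes, groups, privileged):
    """Ratio of favorable-outcome rates: unprivileged over privileged.
    A common rule of thumb flags ratios below 0.8 as potential bias."""
    def rate(group):
        selected = [o for o, g in zip(outcomes, groups) if g == group]
        return sum(selected) / len(selected)
    unprivileged = next(g for g in set(groups) if g != privileged)
    return rate(unprivileged) / rate(privileged)

# 1 = loan approved, 0 = denied, for two demographic groups
outcomes = [1, 1, 0, 1, 0, 0, 1, 0]
groups   = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(round(disparate_impact(outcomes, groups, privileged="A"), 4))  # 0.3333
```

Here group B's approval rate (25%) is one third of group A's (75%), well below the 0.8 threshold, so a governance tool would flag this model for review.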

AI Governance Regulations

The EU AI Act requires organizations to implement risk management systems, maintain technical documentation, ensure transparency, and enable human oversight for high-risk AI systems. IBM watsonx.governance is designed to help organizations meet these requirements. Start governance early in your AI development process rather than retrofitting it just before deployment.

Best Practices

  • Start with Granite models for IBM Cloud workloads due to IP indemnity and enterprise optimizations.
  • Use RAG for domain-specific Q&A rather than fine-tuning to maintain data freshness and security.
  • Implement prompt guardrails to prevent misuse and ensure output quality.
  • Monitor model outputs with watsonx.governance for bias, drift, and quality.
  • Use Granite Embedding for vector search to maintain consistency across the IBM platform.
  • Version and test prompts systematically using watsonx.ai prompt lab.
  • Implement rate limiting and cost tracking for API-based model consumption.
  • Use Watson Assistant with conversational search for customer-facing AI interactions.
  • Store model artifacts and training data in Cloud Object Storage with versioning.
  • Maintain an AI model inventory with full lineage tracking from the start.
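
For the rate-limiting recommendation above, a client-side token bucket is a common pattern. A minimal sketch (illustrative, not an IBM Cloud feature):

```python
import time

class TokenBucket:
    """Simple token-bucket rate limiter to guard model API calls."""
    def __init__(self, rate_per_sec, capacity):
        self.rate = rate_per_sec      # tokens refilled per second
        self.capacity = capacity     # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        """Return True if a call may proceed, consuming one token."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate_per_sec=2, capacity=2)
results = [bucket.allow() for _ in range(4)]
print(results)  # burst of 2 allowed, then calls are throttled
```

Wrap each generate_text call in a check like this (combined with a per-call token counter) to keep consumption within budget before the platform-side limits kick in.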

Key Takeaways

  1. Granite models are enterprise-optimized with IP indemnity protection for commercial use.
  2. watsonx.ai supports prompt tuning to customize models without modifying base model weights.
  3. RAG with Watson Discovery provides enterprise Q&A without fine-tuning on private data.
  4. watsonx.governance monitors AI for bias, drift, and quality to meet EU AI Act requirements.

Frequently Asked Questions

What is the difference between watsonx.ai and Watson Studio?
watsonx.ai is the next-generation AI platform focused on foundation models (LLMs), prompt engineering, and model tuning. Watson Studio is the traditional ML platform for building, training, and deploying custom ML models. Both are accessible through IBM Cloud, but watsonx.ai is the recommended platform for generative AI use cases.
Can I use non-IBM models on watsonx?
Yes, watsonx.ai hosts both IBM Granite models and third-party models including Meta Llama 3, Mistral, and others. You can use any supported model through the same API and prompt lab interface. Granite models include IP indemnity; third-party model terms vary by provider.

Written by CloudToolStack Team

Cloud engineers and architects with hands-on experience across AWS, Azure, and GCP. We write guides based on real-world production patterns, not just documentation rewrites.

Disclaimer: This guide is for educational purposes. Cloud services change frequently; always refer to official documentation for the latest information. IBM, AWS, Azure, and GCP are trademarks of their respective owners.