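What Is a Mixture of Experts (MoE) in Machine Learning?

In this guide, you will learn about Mixture of Experts:

What MoE is and how it differs from traditional models
The benefits of using it
A step-by-step tutorial for implementing it

Let's dive in!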

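What Is MoE?

MoE (Mixture of Experts) is a machine learning architecture that combines multiple specialized sub-models, the "experts," within a larger system. Each expert learns to handle a different aspect of a task or a different type of data.

A fundamental component of this architecture is the "gating network," or "router." It decides which expert, or combination of experts, should process a given input. The gating network also assigns a weight to each expert's output. A weight works like a score: it indicates how much influence that expert's result should have.

Put simply, the gating network looks at the features of the input and uses the weights to adjust each expert's contribution to the final answer. This lets the system handle a wider variety of data than a single model could.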
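
To make the weighting idea concrete, here is a minimal sketch in plain Python. It assumes the gate produces one raw score per expert and turns the scores into weights with a softmax, which is a common choice but not the only one. The function names and the numbers are made up for illustration.

```python
import math

def softmax(scores):
    """Turn raw gate scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_output(gate_scores, expert_outputs):
    """Weighted sum of expert outputs using softmax gate weights."""
    weights = softmax(gate_scores)
    combined = sum(w * y for w, y in zip(weights, expert_outputs))
    return weights, combined

# Hypothetical numbers: three experts, the gate favors the second one
weights, combined = mixture_output([1.0, 3.0, 0.5], [10.0, 20.0, 30.0])
print([round(w, 3) for w in weights], round(combined, 2))
```

The expert with the highest gate score gets the largest weight, so the combined result sits closest to that expert's output.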

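Differences Between MoE and Traditional Dense Models

In the context of neural networks, a traditional dense model works differently from MoE. A dense model uses all of its internal parameters to compute the output for any input it receives, so every part of the network is involved for every input.

The key point is that in a dense model every part takes part in every task. In an MoE model, by contrast, only the relevant subset of experts is activated.

Here are the main differences between MoE and dense models: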

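Parameter usage:
- Dense model: for a given input, the model uses all of its parameters in the computation.
- MoE model: for a given input, the model uses only the parameters of the selected experts and of the gating network. So even when an MoE model has a very large number of parameters, only a fraction of them is active in any single computation.
Computational cost:
- Dense model: the amount of computation in a dense layer is the same for every input.
- MoE model: the cost of processing an input through an MoE layer can be lower than that of a dense layer with a comparable total parameter count, because only a subset of the model (the selected experts) does the work. This lets MoE models scale to a much larger total parameter count without a proportional increase in the cost per input.
Specialization and learning:
- Dense model: every part of a dense layer learns to contribute to processing every type of input it encounters.
- MoE model: different expert networks can learn to specialize. For example, one expert may become good at handling questions about history while another specializes in scientific concepts. The gating network learns to identify the type of input and route it to the most appropriate experts, which allows more nuanced and effective processing.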
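
The difference between total and active parameters is easy to check with a bit of arithmetic. The sketch below assumes a single MoE layer with equally sized experts and a gate that selects a fixed number of experts (top_k) per input. All sizes are hypothetical and chosen only to keep the numbers readable.

```python
def moe_layer_params(num_experts, params_per_expert, gate_params, top_k):
    """Return (total, active-per-input) parameter counts for one MoE layer."""
    total = num_experts * params_per_expert + gate_params
    active = top_k * params_per_expert + gate_params
    return total, active

# Hypothetical sizes, chosen only to make the arithmetic easy to follow
total, active = moe_layer_params(
    num_experts=8, params_per_expert=1_000_000, gate_params=10_000, top_k=2
)
print(f"total={total:,} active={active:,} fraction={active / total:.1%}")
```

With these made-up sizes, each input touches roughly a quarter of the layer's parameters. A dense layer of the same total size would use all of them for every input.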

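Benefits of the Mixture of Experts Architecture

The MoE architecture matters a great deal in modern AI, particularly for LLMs, because it offers a way to increase a model's capacity (its ability to learn and store information) without a proportional increase in computational cost at inference time.

The main benefits of MoE in AI are: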

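Reduced inference latency: MoE models can reduce the time needed to generate a prediction or output, known as inference latency, because they activate only the most relevant experts.
Improved training scalability and efficiency: The parallelism of the MoE architecture can be exploited during training. Different experts can be trained concurrently on different data subsets or specialized tasks, which can lead to faster convergence and shorter training times.
Improved model modularity and maintainability: Because expert sub-networks are discrete, they lend themselves to a modular approach to model development and maintenance. Individual experts can be updated, retrained, or replaced with improved versions independently, without fully retraining the whole model. This makes it easier to integrate new knowledge or capabilities and allows more targeted intervention when a specific expert's performance degrades.
Potential for better interpretability: Expert specialization may offer clearer insight into the model's decision-making process. Analyzing which experts are consistently activated for particular inputs can provide clues about how the model has learned to partition the problem space and attribute relevance. Compared with a monolithic dense network, this offers a potential way to better understand complex model behavior.
Better energy efficiency at scale: MoE-based models can use less energy per query than traditional dense models, because parameters are activated sparsely during inference: only a small fraction of the available parameters is used for each input.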
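
The interpretability point can be illustrated with a small sketch. It assumes you have logged, for each input, a category label and the name of the expert the gate selected. The log below is entirely hypothetical; the point is only to show the kind of tally you might inspect.

```python
from collections import Counter

def expert_usage(routing_log):
    """Count how often each expert was selected, grouped by input category."""
    usage = {}
    for category, expert_name in routing_log:
        usage.setdefault(category, Counter())[expert_name] += 1
    return usage

# Hypothetical routing log: (input category, expert chosen by the gate)
routing_log = [
    ("history", "expert_0"),
    ("history", "expert_0"),
    ("science", "expert_1"),
    ("history", "expert_1"),
    ("science", "expert_1"),
]

for category, counts in expert_usage(routing_log).items():
    print(category, counts.most_common())
```

If one expert dominates a category, that is a hint (not proof) that the model has learned to specialize along that line.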

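How to Implement MoE: A Step-by-Step Guide

In this tutorial section, you will learn how to use MoE. Specifically, you will use a dataset of sports news. The MoE relies on two experts, based on the following models:

sshleifer/distilbart-cnn-6-6: summarizes the content of each news item.
distilbert-base-uncased-finetuned-sst-2-english: computes the sentiment of each news item. In sentiment analysis, "sentiment" refers to the emotional tone, opinion, or attitude expressed in a text. Typical outcomes are:
- Positive: expresses a favorable opinion, happiness, or satisfaction.
- Negative: expresses an unfavorable opinion, sadness, anger, or dissatisfaction.
- Neutral: expresses no strong emotion or opinion, and is often factual.
Note that this particular model is a binary classifier fine-tuned on SST-2, so it returns only POSITIVE or NEGATIVE labels.

At the end of the process, the following is saved to a JSON file for each news item:

The ID, headline, and URL.
A summary of the content.
The sentiment of the content with its confidence score.

The dataset of news can be retrieved with Bright Data's Web Scraper APIs, specialized scraping endpoints that retrieve structured web data from more than 100 domains in real time.

The dataset containing the input JSON data can be generated with the code from the guide "Understanding Vector Databases: The Engine Behind Modern AI." Specifically, see step 1 of the chapter "Practical Integration: A Step-by-Step Guide."

The input JSON dataset, called news-data.json, contains an array of news items like the following: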

[
  {
    "id": "c787dk9923ro",
    "url": "https://www.bbc.com/sport/tennis/articles/c787dk9923ro",
    "author": "BBC",
    "headline": "Wimbledon plans to increase 'Henman Hill' capacity and accessibility",
    "topics": [
      "Tennis"
    ],
    "publication_date": "2025-04-03T11:28:36.326Z",
    "content": "Wimbledon is planning to renovate its iconic 'Henman Hill' and increase capacity for the tournament's 150th anniversary. Thousands of fans have watched action on a big screen from the grass slope which is open to supporters without show-court tickets. The proposed revamp - which has not yet been approved - would increase the hill's capacity by 20% in time for the 2027 event and increase accessibility. It is the latest change planned for the All England Club, after a 39-court expansion was approved last year. Advertisement \"It's all about enhancing this whole area, obviously it's become extremely popular but accessibility is difficult for everyone,\" said four-time Wimbledon semi-finalist Tim Henman, after whom the hill was named. \"We are always looking to enhance wherever we are on the estate. This is going to be an exciting project.\"",
    "videos": [],
    "images": [
      {
        "image_url": "https://ichef.bbci.co.uk/ace/branded_sport/1200/cpsprodpb/31f9/live/0f5b2090-106f-11f0-b72e-6314f702e779.jpg",
        "image_description": "Main image"
      },
      {
        "image_url": "https://ichef.bbci.co.uk/ace/standard/2560/cpsprodpb/31f9/live/0f5b2090-106f-11f0-b72e-6314f702e779.jpg",
        "image_description": "A render of planned improvements to Wimbledon's Henman Hill"
      }
    ],
    "related_articles": [
      {
        "article_title": "Live scores, results and order of playLive scores, results and order of play",
        "article_url": "https://www.bbc.com/sport/tennis/scores-and-schedule"
      },
      {
        "article_title": "Get tennis news sent straight to your phoneGet tennis news sent straight to your phone",
        "article_url": "https://www.bbc.com/sport/articles/cl5q9dk9jl3o"
      }
    ],
    "keyword": null,
    "timestamp": "2025-05-19T15:03:16.568Z",
    "input": {
      "url": "https://www.bbc.com/sport/tennis/articles/c787dk9923ro",
      "keyword": ""
    }
  },
  // omitted for brevity...
]
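
Before running any models, it can help to check that every item has the fields the tutorial reads later (id, headline, content, and url). The sketch below is one way to do that. The helper name and the tiny in-memory sample are made up for illustration; with the real dataset you would load news-data.json with json.load() instead.

```python
import json

# Fields the tutorial reads from each news item
REQUIRED_FIELDS = ("id", "headline", "content", "url")

def find_incomplete_items(news_items):
    """Return (index, missing_fields) for items lacking a required field."""
    problems = []
    for index, item in enumerate(news_items):
        missing = [field for field in REQUIRED_FIELDS if not item.get(field)]
        if missing:
            problems.append((index, missing))
    return problems

# Tiny in-memory stand-in for news-data.json (values are made up)
sample = json.loads("""
[
  {"id": "a1", "headline": "Example headline", "content": "Example body.", "url": "https://example.com/a1"},
  {"id": "a2", "headline": "No body here", "content": "", "url": "https://example.com/a2"}
]
""")

print(find_incomplete_items(sample))
```

An empty list means every item has what the experts need. Anything else tells you which items to inspect or drop before processing.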

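Follow the steps below to build your MoE example!

Prerequisites and Dependencies

To replicate this tutorial, you need Python 3.10.1 or later installed on your machine.

Suppose you call the main project folder moe_project/. At the end of this step, the folder will have the following structure: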

moe_project/
├── venv/
├── news-data.json
└── moe_analysis.py

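Where:

venv/ contains the Python virtual environment.
news-data.json is the input JSON file containing the news data scraped with the Web Scraper API.
moe_analysis.py is the Python file containing the coding logic.

You can create the venv/ virtual environment directory like so: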

python -m venv venv

To activate it, on Windows, run:

venv\Scripts\activate

Equivalently, on macOS and Linux, run:

source venv/bin/activate

In the activated virtual environment, install the dependencies with:

pip install transformers torch

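These libraries are:

transformers: Hugging Face's library for state-of-the-art machine learning models.
torch: PyTorch, an open-source machine learning framework.

Step #1: Setup and Configuration

Initialize the moe_analysis.py file by importing the required libraries and setting a few constants: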

import json
from transformers import pipeline

# Define the input JSON file
JSON_FILE = "news-data.json"
# Specify the model for generating summaries
SUMMARIZATION_MODEL = "sshleifer/distilbart-cnn-6-6"
# Specify the model for analyzing sentiment
SENTIMENT_MODEL = "distilbert-base-uncased-finetuned-sst-2-english"

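This code defines:

The name of the input JSON file with the scraped news.
The models used by the experts.

Perfect! You have what you need to get started with MoE in Python.

Step #2: Define the News Summarization Expert

In this step, you create a class that encapsulates the functionality of the expert that summarizes the news: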

class NewsSummarizationLLMExpert:
    def __init__(self, model_name=SUMMARIZATION_MODEL):
        self.model_name = model_name
        self.summarizer = None

        # Initialize the summarization pipeline
        self.summarizer = pipeline(
            "summarization",
            model=self.model_name,
            tokenizer=self.model_name,
        )

    def analyze(self, article_content, article_headline=""):
        # Call the summarizer pipeline with the article content
        summary_outputs = self.summarizer(
            article_content,
            max_length=300,
            min_length=30,
            do_sample=False
        )
        # Extract the summary text from the pipeline's output
        summary = summary_outputs[0]["summary_text"]
        return { "summary": summary }

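The above code:

Initializes the summarization pipeline with the pipeline() function from Hugging Face.
Defines how the summarization expert processes an article in the analyze() method.

Good! You just created the first expert of the MoE architecture, the one that takes care of summarizing the news.

Step #3: Define the Sentiment Analysis Expert

As with the summarization expert, define a dedicated class for performing sentiment analysis on the news: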

class SentimentAnalysisLLMExpert:
    def __init__(self, model_name=SENTIMENT_MODEL):
        self.model_name = model_name
        self.sentiment_analyzer = None 

        # Initialize the sentiment analysis pipeline
        self.sentiment_analyzer = pipeline(
            "sentiment-analysis",
            model=self.model_name,
            tokenizer=self.model_name,
        )

    def analyze(self, article_content, article_headline=""):
        # Define the maximum number of characters (not tokens) to analyze
        max_chars_for_sentiment = 2000
        # Truncate the content if it exceeds the maximum limit
        truncated_content = article_content[:max_chars_for_sentiment]
        # Call the sentiment analyzer pipeline
        sentiment_outputs = self.sentiment_analyzer(truncated_content)
        # Extract the sentiment label
        label = sentiment_outputs[0]["label"]
        # Extract the sentiment score
        score = sentiment_outputs[0]["score"]
        return { "sentiment_label": label, "sentiment_score": score }

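This snippet:

Initializes the sentiment analysis pipeline with the pipeline() function.
Defines the analyze() method, which performs the sentiment analysis and returns the sentiment label (negative or positive) together with the confidence score.

Very well! You now have a second expert, one that computes and reports the sentiment of the news text.

Step #4: Implement the Gating Network

Next, you have to define the logic of the gating network that routes items to the experts: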

def route_to_experts(item_data, experts_registry):
    chosen_experts = []
    # Select the summarizer and sentiment analyzer
    chosen_experts.append(experts_registry["summarizer"])
    chosen_experts.append(experts_registry["sentiment_analyzer"])
    return chosen_experts

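In this implementation, the gating network is simple: it always uses both experts for every news item, one after the other:

Summarize the text.
Compute the sentiment.

Note: In this example, the gating network is deliberately simple, and because it always runs both experts, it does not perform the sparse routing described earlier. Even so, reaching the same goal with a single large general-purpose model would likely require considerably more computation, whereas each of the two experts is used only for the task it is suited to. That makes this a simple but useful illustration of the Mixture of Experts idea.

In other scenarios, you could improve this part of the process by training an ML model that learns when and how to activate specific experts, so that the gating network responds dynamically to each input.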
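
As a halfway point between the static router and a trained one, here is a sketch of a rule-based router that picks experts per item. To be clear, this is a hand-written rule, not a learned gate. The function name and the 500-character threshold are assumptions made for this example only.

```python
def route_to_experts_dynamic(item_data, experts_registry, min_chars_for_summary=500):
    """Pick experts per item instead of always returning all of them."""
    content = item_data.get("content") or ""
    chosen_experts = []
    # Only summarize articles long enough to benefit from a summary
    if len(content) >= min_chars_for_summary:
        chosen_experts.append(experts_registry["summarizer"])
    # Sentiment analysis needs some text to work with
    if content:
        chosen_experts.append(experts_registry["sentiment_analyzer"])
    return chosen_experts

# Placeholder strings stand in for the expert objects in this demo
registry = {"summarizer": "summarizer", "sentiment_analyzer": "sentiment_analyzer"}
short_item = {"content": "Short match report."}
long_item = {"content": "x" * 600}
print(route_to_experts_dynamic(short_item, registry))
print(route_to_experts_dynamic(long_item, registry))
```

Because it takes the same item and registry arguments as route_to_experts() and returns a list of experts, it could be swapped in without changing the rest of the script. Short items would then skip the summarizer entirely, which is the kind of selective activation MoE is about.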

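Great! The gating network logic is set up and ready to operate.

Step #5: Main Orchestration Logic for Processing the News Data

Define the core function that manages the whole workflow, which consists of the following tasks:

Load the JSON dataset.
Initialize the two experts.
Iterate over the news items.
Route each item to the chosen experts.
Collect the results.

You can do this with the following code: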

def process_news_json_with_moe(json_filepath):
    # Open and load news items from the JSON file
    with open(json_filepath, "r", encoding="utf-8") as f:
        news_items = json.load(f)

    # Create a dictionary to hold instances of expert classes
    experts_registry = {
        "summarizer": NewsSummarizationLLMExpert(),
        "sentiment_analyzer": SentimentAnalysisLLMExpert()
    }

    # List to store the analysis results
    all_results = []

    # Iterate through each news item in the loaded data
    for i, news_item in enumerate(news_items):
        print(f"\n--- Processing Article {i+1}/{len(news_items)} ---")
        # Extract relevant data from the news item
        id = news_item.get("id")
        headline = news_item.get("headline")
        content = news_item.get("content")
        url = news_item.get("url")

        # Print progress
        print(f"ID: {id}, Headline: {headline[:70]}...")

        # Use the gating network to determine the expert to use
        active_experts = route_to_experts(news_item, experts_registry)

        # Prepare a dictionary to store the analysis results
        news_item_analysis_results = {
            "id": id,
            "headline": headline,
            "url": url,
            "analyses": {}
        }

        # Iterate through the experts and apply their analysis
        for expert_instance in active_experts:
            expert_name = expert_instance.__class__.__name__ # Get the class name of the expert
            try:
                # Call the expert's analyze method
                analysis_result = expert_instance.analyze(article_content=content, article_headline=headline)
                # Store the result under the expert's name
                news_item_analysis_results["analyses"][expert_name] = analysis_result

            except Exception as e:
                # Handle any errors during analysis by a specific expert
                print(f"Error during analysis with {expert_name}: {e}")
                news_item_analysis_results["analyses"][expert_name] = { "error": str(e) }

        # Add the current item's results to the overall list
        all_results.append(news_item_analysis_results)

    return all_results

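In this snippet:

The for loop iterates over all the loaded news items.
The try-except block runs the analysis and handles any errors that occur. Here, errors are most likely to come from the length limits set in the earlier functions, max_length and max_chars_for_sentiment. Since the scraped articles vary in length, error handling is essential so that one failing article does not stop the whole run.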
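
One likely cause of failures on long articles is that the input is longer than the summarization model can accept. A simple mitigation is to shorten the text before handing it to the expert. The helper below is a rough sketch: it counts whitespace-separated words, which is only a proxy for model tokens, and the 600-word limit is a hypothetical value you would need to tune.

```python
def truncate_words(text, max_words=600):
    """Keep at most max_words whitespace-separated words of text."""
    words = (text or "").split()
    if len(words) <= max_words:
        return text or ""
    return " ".join(words[:max_words])

long_article = "word " * 2000
shortened = truncate_words(long_article, max_words=600)
print(len(shortened.split()))
```

You could call it on article_content at the top of the summarization expert's analyze() method. Hugging Face pipelines also commonly accept a truncation option at call time, so it is worth checking the documentation for your installed version before rolling your own.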

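Here we go! You have defined the orchestration function for the whole process.

Step #6: Launch the Processing Function

As the final part of the script, you have to run the main processing function and save the analyses to an output JSON file, as follows: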

# Call the main processing function with the input JSON file
final_analyses = process_news_json_with_moe(JSON_FILE)

print("\n\n--- MoE Analysis Complete ---")

# Write the final analysis results to a new JSON file
with open("analyzed_news_data.json", "w", encoding="utf-8") as f_out:
    json.dump(final_analyses, f_out, indent=4, ensure_ascii=False)

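In the above code:

The final_analyses variable stores the result of calling the function that processes the data with MoE.
The analyzed data is written to the analyzed_news_data.json output file.

Et voilà! The whole script is complete: the data is analyzed and saved.

Step #7: Put It All Together and Run the Code

Below is what the moe_analysis.py file should now contain: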

import json
from transformers import pipeline

# Define the input JSON file
JSON_FILE = "news-data.json"
# Specify the model for generating summaries
SUMMARIZATION_MODEL = "sshleifer/distilbart-cnn-6-6"
# Specify the model for analyzing sentiment
SENTIMENT_MODEL = "distilbert-base-uncased-finetuned-sst-2-english"

# Define a class representing an expert for news summarization
class NewsSummarizationLLMExpert:
    def __init__(self, model_name=SUMMARIZATION_MODEL):
        self.model_name = model_name
        self.summarizer = None

        # Initialize the summarization pipeline
        self.summarizer = pipeline(
            "summarization",
            model=self.model_name,
            tokenizer=self.model_name,
        )

    def analyze(self, article_content, article_headline=""):
        # Call the summarizer pipeline with the article content
        summary_outputs = self.summarizer(
            article_content,
            max_length=300,
            min_length=30,
            do_sample=False
        )
        # Extract the summary text from the pipeline's output
        summary = summary_outputs[0]["summary_text"]
        return { "summary": summary }


# Define a class representing an expert for sentiment analysis
class SentimentAnalysisLLMExpert:
    def __init__(self, model_name=SENTIMENT_MODEL):
        self.model_name = model_name
        self.sentiment_analyzer = None

        # Initialize the sentiment analysis pipeline
        self.sentiment_analyzer = pipeline(
            "sentiment-analysis",
            model=self.model_name,
            tokenizer=self.model_name, 
        )


    def analyze(self, article_content, article_headline=""):
        # Define the maximum number of characters (not tokens) to analyze
        max_chars_for_sentiment = 2000
        # Truncate the content if it exceeds the maximum limit
        truncated_content = article_content[:max_chars_for_sentiment]
        # Call the sentiment analyzer pipeline
        sentiment_outputs = self.sentiment_analyzer(truncated_content)
        # Extract the sentiment label
        label = sentiment_outputs[0]["label"]
        # Extract the sentiment score
        score = sentiment_outputs[0]["score"]
        return { "sentiment_label": label, "sentiment_score": score }


# Define a gating network
def route_to_experts(item_data, experts_registry):
    chosen_experts = []
    # Select the summarizer and sentiment analyzer
    chosen_experts.append(experts_registry["summarizer"])
    chosen_experts.append(experts_registry["sentiment_analyzer"])
    return chosen_experts


# Main function to manage the orchestration process
def process_news_json_with_moe(json_filepath):
    # Open and load news items from the JSON file
    with open(json_filepath, "r", encoding="utf-8") as f:
        news_items = json.load(f)

    # Create a dictionary to hold instances of expert classes
    experts_registry = {
        "summarizer": NewsSummarizationLLMExpert(),
        "sentiment_analyzer": SentimentAnalysisLLMExpert()
    }

    # List to store the analysis results
    all_results = []

    # Iterate through each news item in the loaded data
    for i, news_item in enumerate(news_items):
        print(f"\n--- Processing Article {i+1}/{len(news_items)} ---")
        # Extract relevant data from the news item
        id = news_item.get("id")
        headline = news_item.get("headline")
        content = news_item.get("content")
        url = news_item.get("url")

        # Print progress
        print(f"ID: {id}, Headline: {headline[:70]}...")

        # Use the gating network to determine the expert to use
        active_experts = route_to_experts(news_item, experts_registry)

        # Prepare a dictionary to store the analysis results
        news_item_analysis_results = {
            "id": id,
            "headline": headline,
            "url": url,
            "analyses": {}
        }

        # Iterate through the experts and apply their analysis
        for expert_instance in active_experts:
            expert_name = expert_instance.__class__.__name__ # Get the class name of the expert
            try:
                # Call the expert's analyze method
                analysis_result = expert_instance.analyze(article_content=content, article_headline=headline)
                # Store the result under the expert's name
                news_item_analysis_results["analyses"][expert_name] = analysis_result

            except Exception as e:
                # Handle any errors during analysis by a specific expert
                print(f"Error during analysis with {expert_name}: {e}")
                news_item_analysis_results["analyses"][expert_name] = { "error": str(e) }

        # Add the current item's results to the overall list
        all_results.append(news_item_analysis_results)

    return all_results

# Call the main processing function with the input JSON file
final_analyses = process_news_json_with_moe(JSON_FILE)

print("\n\n--- MoE Analysis Complete ---")

# Write the final analysis results to a new JSON file
with open("analyzed_news_data.json", "w", encoding="utf-8") as f_out:
    json.dump(final_analyses, f_out, indent=4, ensure_ascii=False)

Great! In about 130 lines of code, you have completed your first MoE project.

Run the code with the following command:

python moe_analysis.py

The output in the terminal should look like this:

# Omitted for brevity...

--- Processing Article 6/10 ---
ID: cdrgdm4ye53o, Headline: Japanese Grand Prix: Lewis Hamilton says he has 'absolute 100% faith' ...

--- Processing Article 7/10 ---
ID: czed4jk7eeeo, Headline: F1 engines: A return to V10 or hybrid - what's the future?...
Error during analysis with NewsSummarizationLLMExpert: index out of range in self

--- Processing Article 8/10 ---
ID: cy700xne614o, Headline: Monte Carlo Masters: Novak Djokovic beaten as wait for 100th title con...
Error during analysis with NewsSummarizationLLMExpert: index out of range in self

# Omitted for brevity...

--- MoE Analysis Complete ---

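When the execution completes, an analyzed_news_data.json output file appears in the project folder. Open it and focus on one of the news items. Its analyses field contains the summary and the sentiment analysis results produced by the two experts.

For such an item, the MoE approach has:

Summarized the content of the article and reported it in the summary field.
Assigned a positive sentiment with a confidence of 0.99.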
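
If you want a quick overview instead of reading the raw JSON, a small helper like the one below can print one line per article. It is a sketch that assumes the output structure built by process_news_json_with_moe(), where results are stored under each expert's class name. The sample entries are made up; they only mirror that structure.

```python
def describe_result(item):
    """Build a one-line report from one entry of the output file."""
    analyses = item.get("analyses", {})
    summary = analyses.get("NewsSummarizationLLMExpert", {})
    sentiment = analyses.get("SentimentAnalysisLLMExpert", {})
    summary_text = summary.get("summary") or f"[error: {summary.get('error', 'missing')}]"
    label = sentiment.get("sentiment_label", "n/a")
    score = sentiment.get("sentiment_score")
    score_text = f"{score:.2f}" if isinstance(score, float) else "n/a"
    return f"{item.get('id')}: {label} ({score_text}) | {summary_text}"

# Made-up entries that mirror the structure built in process_news_json_with_moe()
sample_results = [
    {"id": "a1", "headline": "h", "url": "u", "analyses": {
        "NewsSummarizationLLMExpert": {"summary": "Short summary."},
        "SentimentAnalysisLLMExpert": {"sentiment_label": "POSITIVE", "sentiment_score": 0.99},
    }},
    {"id": "a2", "headline": "h", "url": "u", "analyses": {
        "NewsSummarizationLLMExpert": {"error": "index out of range in self"},
        "SentimentAnalysisLLMExpert": {"sentiment_label": "NEGATIVE", "sentiment_score": 0.87},
    }},
]

for entry in sample_results:
    print(describe_result(entry))
```

To use it on the real output, load analyzed_news_data.json with json.load() and pass each entry to describe_result(). Entries where an expert failed show the stored error message instead of a summary.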

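Mission complete!

Conclusion

In this article, you learned what MoE is and how to implement it in a real-world scenario through a step-by-step section.

If you want to explore more MoE scenarios and need fresh data for them, Bright Data offers a suite of powerful tools and services designed to retrieve up-to-date, real-time data from web pages while overcoming scraping obstacles.

These solutions include:

Web Unlocker: An API that bypasses anti-scraping protections and delivers clean HTML from any web page with minimal effort.
Scraping Browser: A cloud-based, controllable browser with JavaScript rendering. It automatically handles CAPTCHAs, browser fingerprinting, retries, and more.
Web Scraper APIs: Endpoints for programmatic access to structured web data from dozens of popular domains.

For other machine learning scenarios, also take a look at our AI hub.

Sign up for Bright Data now and start your free trial to test our scraping solutions!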
