Summary
summary
| MODULE | DESCRIPTION |
|---|---|
| main | Main module of the application. |
| src | Source code of the summary containing core components and utilities. |
main
Main module of the application.
This module serves as the entry point for the program. It imports necessary modules, sets up any initial configuration or data structures, and possibly defines main functions or classes that are used throughout the application.
src
Source code of the summary containing core components and utilities.
| MODULE | DESCRIPTION |
|---|---|
| app | Initializes the app. |
| endpoints | Defines all endpoints of the FastAPI app. |
| input_handling | Handles logic of file and text summary endpoints. |
| models | Models loading and checking API and configuration parameters. |
| settings | Loads all settings from a central place, not hidden in utils. |
| summarizing | Implementation of the core logic of the summary. |
| utils | Utils functions for logging, LLM availability check, LLM authentication and configuration processing. |
app
Initializes the app.
| FUNCTION | DESCRIPTION |
|---|---|
| lifespan | Sets up a scheduler and updates available llms. |
lifespan
async
Sets up a scheduler and updates available llms.
This lifespan function is started on startup of FastAPI. The first part, up to the `yield`, is executed on startup and initializes a scheduler that regularly checks the LLM API. The second part is executed on shutdown and is used to clean up the scheduler.
The available LLMs, i.e. the LLMs for which the API checks passed, are cached in the FastAPI state object as `app.state.available_llms`.
Source code in docs/microservices/summary/src/app.py
@asynccontextmanager
async def lifespan(_app: FastAPI) -> AsyncGenerator[None, None]:
    """Sets up a scheduler and updates available llms.

    This lifespan function is started on startup of FastAPI. The first part
    - till `yield` is executed on startup and initializes a scheduler to regulary
    check the LLM-API. The second part is executed on shutdown and is used to
    clean up the scheduler.

    The available LLMs - i.e. the LLMs where API-checks passed - are cached in
    FastAPI state object as `app.state.available_llms`.
    """

    async def update_llm_state() -> None:
        _app.state.available_llms = await get_available_llms()

    # store available LLMs in FastAPI app state
    _app.state.available_llms = await get_available_llms()

    # setup a scheduler
    scheduler = AsyncIOScheduler()
    scheduler.add_job(
        update_llm_state,
        "interval",
        seconds=settings.check_llm_api_interval_in_s,
    )
    scheduler.start()

    yield

    # cleanup
    scheduler.shutdown()
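For orientation, a minimal sketch of how this lifespan handler is typically wired into the application (the import path is an assumption based on the documented layout):

```python
# Hypothetical wiring sketch, not taken from the repository.
from fastapi import FastAPI

from src.app import lifespan  # import path assumed

app = FastAPI(lifespan=lifespan)
# On startup the available LLMs are cached in app.state.available_llms and
# refreshed every settings.check_llm_api_interval_in_s seconds; on shutdown
# the scheduler is stopped.
```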
endpoints
Defines all endpoints of the FastAPI app.
| FUNCTION | DESCRIPTION |
|---|---|
| get_llms | Returns model information of available LLMs. |
| health | Performs a health check of the summary service. |
| summarize_file | Summarizes a pdf, docx or txt file including its parsing, cleaning and chunking. |
| summarize_text | Summarizes text input including basic text cleaning and chunking of text input. |
get_llms
async
Returns model information of available LLMs.
| PARAMETER | DESCRIPTION |
|---|---|
| request | Request-Data. TYPE: Request |

| RETURNS | DESCRIPTION |
|---|---|
| list[dict[str, Any]] | List with information for each LLM. |
Source code in docs/microservices/summary/src/endpoints.py
@router.get(
    "/llms",
    summary="List available language models.",
    description=("Returns a list of available language models (LLMs).\n\n"),
    responses={
        200: {
            "description": "List of available LLMs.",
            "content": {
                "application/json": {
                    "example": [
                        {
                            "label": "test_model:mock",
                            "is_remote": False,
                            "name": "test_model_mock",
                        },
                    ]
                }
            },
        },
        500: {"description": "Internal server error accessing microservice"},
    },
)
async def get_llms(request: Request) -> list[dict[str, Any]]:
    """Returns model information of available LLMs.

    Args:
        request (Request): Request-Data.

    Returns:
        List with information for each LLM.
    """
    app = request.app  # indirectly access the FastAPI app object
    return app.state.available_llms
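For illustration, a client call against a locally running instance might look like this (the base URL is an assumption):

```python
# Hypothetical client call; adjust the base URL to your deployment.
import httpx

response = httpx.get("http://localhost:8000/llms")
response.raise_for_status()
for llm in response.json():
    print(llm["label"], llm["is_remote"])
```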
health
async
Performs a health check of the summary service.
| RETURNS | DESCRIPTION |
|---|---|
| dict[str, str] | Health check message as a dictionary. |
Source code in docs/microservices/summary/src/endpoints.py
@router.get(
    "/",
    summary="Health check endpoint",
    description=(
        "Returns a simple message indicating that the summary service is running.\n\n"
        "Use this endpoint to verify that the service is alive and responsive."
    ),
    responses={
        200: {
            "description": "Health check successful",
            "content": {
                "application/json": {"example": {"status": "Summary is running"}}
            },
        },
        500: {"description": "Internal server error"},
    },
)
@router.get(
    "/health",
    summary="Health check endpoint",
    description=(
        "Returns a simple message indicating that the summary service is running.\n\n"
        "Use this endpoint to verify that the service is alive and responsive."
    ),
    responses={
        200: {
            "description": "Health check successful",
            "content": {
                "application/json": {"example": {"status": "Summary is running"}}
            },
        },
        500: {"description": "Internal server error"},
    },
)
async def health() -> dict[str, str]:
    """Performs a health check of the summary service.

    Returns:
        Health check message as a dictionary.
    """
    return {"message": f"{settings.service_name} is running"}
summarize_file
async
summarize_file(file=File(..., description='Upload a PDF, DOCX, or TXT file.'), api_input=Depends(SummaryFileAPIInputParameters.as_form))
Summarizes a pdf, docx or txt file including its parsing, cleaning and chunking.
| PARAMETER | DESCRIPTION |
|---|---|
| file | Input file either pdf, docx or txt. TYPE: UploadFile |
| api_input | Containing the name of the language model, desired length of summary output as number of DIN-A4 pages, topics on which the summary should be focussed. TYPE: SummaryFileAPIInputParameters |

| RETURNS | DESCRIPTION |
|---|---|
| SummaryAPIOutput | The summary, a message to the user and the parsing output. |
Source code in docs/microservices/summary/src/endpoints.py
@router.post(
"/summary/file",
response_model=SummaryAPIOutput,
summary="File summary endpoint.",
description=(
"Generates a summary for a PDF, DOCX, or TXT file.\n\n"
"The endpoint parses, cleans, chunks the input file and then summarizes the text "
"according to the requested output length, focusing the summary on the user-defined topics "
"and using the specified language model."
),
openapi_extra={
"requestBody": {
"content": {
"application/json": {
"examples": SummaryFileAPIInputParameters.model_config[
"json_schema_extra"
]["openapi_examples"],
}
}
}
},
responses={
200: {
"description": "Successful summary generation.",
"content": {
"application/json": {
"examples": SummaryAPIOutput.model_config["json_schema_extra"][
"openapi_examples"
],
}
},
},
400: {"description": "Invalid request."},
424: {"description": "Failed dependency."},
408: {"description": "Request timeout of a dependency."},
},
)
async def summarize_file(
file: UploadFile = File(..., description="Upload a PDF, DOCX, or TXT file."),
api_input: SummaryFileAPIInputParameters = Depends(
SummaryFileAPIInputParameters.as_form
),
) -> SummaryAPIOutput:
"""Summarizes a pdf, docx or txt file including its parsing, cleaning and chunking.
Args:
file (UploadFile): Input file either pdf, docx or txt.
api_input (SummaryFileAPIInputParameters): Containing the name of the language model, desired length of
summary output as number of DIN-A4 pages, topics on which the summary should be focussed.
Returns:
The summary, a message to the user and the parsing output.
"""
summary_output = await request_handling(
endpoint="file",
parsing_input=file,
language_model=api_input.language_model,
output_length=api_input.output_length,
topics=api_input.topics,
)
return summary_output
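A hypothetical client request to this endpoint, using the multipart form fields defined by `as_form` (base URL and file path are assumptions):

```python
# Hypothetical multipart upload; base URL and file path are placeholders.
import httpx

with open("tests/data/pdf-testfile.pdf", "rb") as f:
    response = httpx.post(
        "http://localhost:8000/summary/file",
        files={"file": ("pdf-testfile.pdf", f, "application/pdf")},
        data={
            "language_model": "test_model_mock",
            "output_length": 2.5,
            "topics": "Open Source, Community-Gedanke",
        },
        timeout=300,
    )
response.raise_for_status()
print(response.json()["summary"])
```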
summarize_text
async
Summarizes text input including basic text cleaning and chunking of text input.
| PARAMETER | DESCRIPTION |
|---|---|
| api_input | Containing the input text, name of the language model, desired length of summary output as number of DIN-A4 pages, topics on which the summary should be focussed. TYPE: SummaryTextAPIInput |

| RETURNS | DESCRIPTION |
|---|---|
| SummaryAPIOutput | The summary, a message to the user and the parsing output. |
Source code in docs/microservices/summary/src/endpoints.py
@router.post(
"/summary/text",
response_model=SummaryAPIOutput,
summary="Text summary endpoint.",
description=(
"Generates a summary from a plain text input.\n\n"
"The endpoint performs basic text cleaning, chunking, and then summarizes the text "
"according to the requested output length, focusing the summary on the user-defined topics "
"and using the specified language model."
),
openapi_extra={
"requestBody": {
"content": {
"application/json": {
"examples": SummaryTextAPIInput.model_config["json_schema_extra"][
"openapi_examples"
],
}
}
}
},
responses={
200: {
"description": "Successful summary generation.",
"content": {
"application/json": {
"examples": SummaryAPIOutput.model_config["json_schema_extra"][
"openapi_examples"
],
}
},
},
400: {"description": "Invalid request."},
424: {"description": "Failed dependency."},
408: {"description": "Request timeout of a dependency."},
},
)
async def summarize_text(
api_input: SummaryTextAPIInput,
) -> SummaryAPIOutput:
"""Summarizes text input including basic text cleaning and chunking of text input.
Args:
api_input (SummaryTextAPIInput): Containing the input text, name of the language model, desired length of
summary output as number of DIN-A4 pages, topics on which the summary should be focussed.
Returns:
The summary, a message to the user and the parsing output.
"""
summary_output = await request_handling(
endpoint="text",
parsing_input=api_input.text,
language_model=api_input.language_model,
output_length=api_input.output_length,
topics=api_input.topics,
)
return summary_output
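A hypothetical JSON request to this endpoint (the base URL and the shortened text are placeholders):

```python
# Hypothetical client call; base URL and text are placeholders.
import httpx

payload = {
    "text": "F13 ist eine vollständig souveräne und Modell-agnostische KI-Assistenz ...",
    "language_model": "test_model_mock",
    "output_length": 0,
}
response = httpx.post("http://localhost:8000/summary/text", json=payload, timeout=300)
response.raise_for_status()
print(response.json()["summary"])
```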
input_handling
Handles logic of file and text summary endpoints.
| FUNCTION | DESCRIPTION |
|---|---|
| parsing_request | Performs error handling for the request to the parsing micro-service. |
| request_handling | This function calls the necessary functions to parse, chunk and summarize a file or text. |
parsing_request
async
Performs error handling for the request to the parsing micro-service.
This includes 3 retries in case the parsing micro-service is not available.
| PARAMETER | DESCRIPTION |
|---|---|
| endpoint | Either "file" or "text" for endpoint selection. TYPE: str |
| parsing_input | Text or file that should be parsed. TYPE: UploadFile \| str |
| chunking_input | Settings for chunking, which includes chunking mode and two input parameters max_llm_input_chars and min_number_of_chunks. The chunksize will be computed from these input parameters. TYPE: dict |

| RETURNS | DESCRIPTION |
|---|---|
| ParsingOutput | Containing the cleaned and parsed text, chunks and a message to the user. |

| RAISES | DESCRIPTION |
|---|---|
| HTTPException | HTTP_424_FAILED_DEPENDENCY if the parsing does not finish within an acceptable time. |
| HTTPException | HTTP_400_BAD_REQUEST if the parser gets some input that can not be parsed. |
Source code in docs/microservices/summary/src/input_handling.py
async def parsing_request(
endpoint: str, parsing_input: UploadFile | str, chunking_input: dict
) -> ParsingOutput:
"""Performs error handling for the request to the parsing micro-service.
This includes 3 retries in case the parsing micro-service is not available.
Args:
endpoint (str): Either "file" or "text" for endpoint selection.
parsing_input (UploadFile | str): Text or file that should be parsed.
chunking_input (dict): Settings for chunking, which includes chunking mode and two input parameters
max_llm_input_chars and min_number_of_chunks. The chunksize will be computed from these input parameters.
Returns:
Containing the cleaned and parsed text, chunks and a message to the user.
Raises:
HTTPException: HTTP_424_FAILED_DEPENDENCY if the parsing does not finish within an acceptable time.
HTTPException: HTTP_400_BAD_REQUEST if the parser gets some input that can not be parsed.
"""
parsing_output = None
config = settings.inter_service_communication.parser
logger.debug(f"Communication with Parser is configured using {config}")
for attempt_counter in range(config.max_attempts):
if attempt_counter > 0:
logger.warning(
f"Retrying request to parser "
f"({attempt_counter} / {config.max_attempts - 1})"
)
try:
response = await _request_parsing_output(
timeout_in_s=config.timeout_in_s,
endpoint=endpoint,
parsing_input=parsing_input,
chunking_input=chunking_input,
)
if response.status_code == httpx.codes.OK:
logger.debug(f"Response from Parser: {response.json()}.")
parsing_output = ParsingOutput(**response.json())
parsed_text_length = len(parsing_output.text)
if parsed_text_length == 0 or parsing_output.chunks is None:
logger.critical(
"Parsing failed because of incomplete parsing output (with length parsed text = "
f"{parsed_text_length} and list of chunks of type = {type(parsing_output.chunks)}). "
"Please check the compatibilty of the summary und parsing microservice versions!"
)
raise HTTPException(
status_code=status.HTTP_424_FAILED_DEPENDENCY,
detail=(
"Der Text konnte nicht verarbeitet werden."
"Bitte wenden Sie sich an Ihren technischen Support."
),
)
else:
logger.critical(
f"Parsing failed with status code {response.status_code}."
)
status_code = status.HTTP_424_FAILED_DEPENDENCY
error_msg = (
"Der Text konnte aufgrund eines unvorhergesehenen Fehlers nicht verarbeitet werden. "
"Bitte versuchen Sie es mit einer anderen Datei."
)
response_payload = response.json()
if "detail" in response_payload:
error_msg = response_payload["detail"]
if response.status_code in (
httpx.codes.BAD_REQUEST,
httpx.codes.UNPROCESSABLE_ENTITY,
):
status_code = response.status_code
raise HTTPException(
status_code=status_code,
detail=error_msg,
)
except httpx.TimeoutException:
logger.error(
"Parsing microservice could not return a parsing result within the accepted time of "
f"{config.timeout_in_s}. Therefore the text could not be summarized!"
)
raise HTTPException(
status_code=status.HTTP_424_FAILED_DEPENDENCY,
detail=(
"Der Text konnte nicht in angemessener Zeit verarbeitet werden. "
"Bitte versuchen Sie es mit einer kleineren Datei."
),
)
except httpx.RequestError as e:
if attempt_counter < config.max_attempts - 1:
logger.warning("Could not connect to parser endpoint.")
await asyncio.sleep(3)
else:
logger.critical(
f"Could not connect to parser endpoint. Giving up after maximal number of retries. Error: {e}."
)
raise HTTPException(
status_code=status.HTTP_424_FAILED_DEPENDENCY,
detail="Aus technischen Gründen kann der Text vorübergehend nicht verarbeitet werden. "
"Bitte versuchen Sie es später erneut.",
)
return parsing_output
request_handling
async
This function calls the necessary functions to parse, chunk and summarize a file or text.
This also includes the computation of parameters, which are required to chunk and summarize the parsed text, using the parsing output and user input.
| PARAMETER | DESCRIPTION |
|---|---|
| endpoint | Either "file" or "text" for endpoint selection. TYPE: str |
| parsing_input | Text or file that should be parsed. TYPE: UploadFile \| str |
| language_model | Name of the language model. TYPE: str |
| output_length | Desired length of summary output as number of DIN-A4 pages. TYPE: float |
| topics | User input text stating the topics the summary should focus on. TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| SummaryAPIOutput | Contains the summary, a message to the user and the parsing output. |
Source code in docs/microservices/summary/src/input_handling.py
async def request_handling(
endpoint: str,
parsing_input: UploadFile | str,
language_model: str,
output_length: float,
topics: str | None,
) -> SummaryAPIOutput:
"""This function calls the necessary functions to parse, chunk and summarize a file or text.
This also includes the computation of parameters, which are required to chunk and summarize the parsed text,
using the parsing output and user input.
Args:
endpoint (str): Either "file" or "text" for endpoint selection.
parsing_input (UploadFile | str): Text or file that should be parsed.
language_model (str): Name of the language model.
output_length (float): Desired length of summary output as number of DIN-A4 pages.
topics (str | None): User input text stating the topics the summary should focus on.
Returns:
Contains the summary, a message to the user and the parsing output.
"""
start_request_timer = time.time()
warning_msg = []
# Parsing
max_input_chars_estimate = summary_registry.estimate_max_input_chars(
language_model=language_model
)
chunking_input = {
"mode": "summary_chunking",
"min_number_of_chunks": 5,
"max_llm_input_chars": max_input_chars_estimate["max_input_chars_estimate"],
}
start_parsing_timer = time.time()
parsing_output = await parsing_request(
endpoint=endpoint,
parsing_input=parsing_input,
chunking_input=chunking_input,
)
logger.debug(
f"Whole text parsing, cleaning and chunking took {int(time.time() - start_parsing_timer)} seconds."
)
warning_msg.append(parsing_output.warning_msg)
text = parsing_output.text
chunks = _transform_chunks_to_langchaindocs(chunks=parsing_output.chunks)
summary_parameters = await summary_registry.configure_summary_parameters(
warning_msg=warning_msg,
language_model=language_model,
remaining_context_length=max_input_chars_estimate["remaining_context_length"],
desired_summary_length={
"input_length": len(text),
"output_length": output_length,
},
topics=topics,
)
start_summary_timer = time.time()
summary_output = await summary_registry.request_summary(
summarize_input=SummarizeInput(
language_model=language_model,
chunks=chunks,
summary_parameters=summary_parameters,
),
text=text,
)
end_summary_timer = time.time()
logger.debug(
f"The summarization took {int(end_summary_timer - start_summary_timer)} seconds."
)
logger.debug(
f"Whole summary request incl. parsing and chunking took {int(end_summary_timer - start_request_timer)}"
" seconds."
)
return summary_output
models
Models loading and checking API and configuration parameters.
| MODULE | DESCRIPTION |
|---|---|
| api_input | Pydantic Models for API input parameters. |
| api_output | Pydantic Models for API output parameters. |
| general | Defines settings and thereby assigns default values. |
| graph_state | Defines states within the LangGraph performing the summary. |
| llms | Pydantic Models describing an LLM used for LLM config. |
| parser_output | Defines the output of the parsing microservice. |
| summary_parameters | Pydantic Models for parameters necessary for the summarization. |
api_input
Pydantic Models for API input parameters.
| CLASS | DESCRIPTION |
|---|---|
| SummaryFileAPIInputParameters | Model defining the input parameters as part of a valid Summary request for file endpoint. |
| SummaryTextAPIInput | Model defining the input of a valid Summary request for text endpoint. |
SummaryFileAPIInputParameters
Bases: BaseModel
Model defining the input parameters as part of a valid Summary request for file endpoint.
For technical reasons, "file (UploadFile)" cannot be part of this pydantic model
even though it is part of the file endpoint's input.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| language_model | The name or identifier of the language model to use. TYPE: str |
| output_length | Desired length of summary output as number of DIN-A4 pages. Default is 0, which will lead to no summary length restrictions. This is the fastest option since the LLM will decide which length is the most suitable one. TYPE: NonNegativeFloat |
| topics | User input text stating the topics the summary should focus on. Default is None, which will lead to a general summary without any focus topics. TYPE: str \| None |

| METHOD | DESCRIPTION |
|---|---|
| as_form | Creates an instance of `SummaryFileAPIInputParameters` from multipart form-data. |
Source code in docs/microservices/summary/src/models/api_input.py
class SummaryFileAPIInputParameters(BaseModel):
"""Model defining the input parameters as part of a valid Summary request for file endpoint.
Note: Due to technical reasons "file (Uploadfile)" can not be part of this pydantic model
even though it is part of the file endpoints input.
Attributes:
language_model (str): The name or identifier of the language model to use.
output_length (NonNegativeFloat, optional): Desired length of summary output as number of DIN-A4 pages.
Default is 0, which will lead to no summary length restrictions. This is the fastest option since
the LLM will decide, which length is the most suitable one.
topics (str | None): User input text stating the topics the summary should focus on.
Default is None, which will lead to a general summary without any focus topics.
"""
language_model: str
output_length: NonNegativeFloat = 0
topics: str | None = None
model_config = ConfigDict(
json_schema_extra={
"openapi_examples": {
"simple_summary": {
"summary": "Simple summary request",
"description": (
"Example input for a summary without predefined output length or focus topic settings."
),
"value": {
"file": "tests/data/txt-testfile.txt",
"language_model": "test_model_mock",
},
},
"fast_summary": {
"summary": "Fast summary request",
"description": (
"Example input for a fast summary. Using a txt file without predefined output length but "
"with focus topic setting."
),
"value": {
"file": "tests/data/txt-testfile.txt",
"language_model": "test_model_mock",
"topics": "Städte",
},
},
"output_length_summary": {
"summary": "Summary request with output length",
"description": "Example input for a summary with predefined output length.",
"value": {
"file": "tests/data/txt-testfile.txt",
"language_model": "test_model_mock",
"output_length": 2.5,
},
},
"focus_topic_summary": {
"summary": "Summary request with focus topic",
"description": "Example input for a summary with focus on a specific topic.",
"value": {
"file": "tests/data/pdf-testfile.pdf",
"language_model": "test_model_mock",
"topics": "Open Source, Community-Gedanke",
},
},
}
}
)
@classmethod
def as_form(
cls,
language_model: str = Form(
...,
description="The name or identifier of the language model to use.",
example="test_model_mock",
),
output_length: float = Form(
0,
description=(
"Desired summary length in DIN-A4 pages. "
"Default 0 = no restriction and high performance."
),
example=3.0,
ge=0,
),
topics: str | None = Form(
None,
description=(
"Comma-separated topics the summary should focus on. "
"Default None = general summary without any focus."
),
example="public administration, artificial intelligence, digitization",
),
) -> Self:
"""Creates an instance of `SummaryFileAPIInputParameters` from multipart form-data.
FastAPI does not automatically map Pydantic models from form-data in
multipart requests. This helper enables the model to be used together
with file uploads by defining how form parameters should be parsed.
Args:
language_model (str): Selected language model.
output_length (float): Desired summary length in pages.
topics (str | None): User-defined focus topics.
Returns:
A validated input parameter set.
"""
return cls(
language_model=language_model,
output_length=output_length,
topics=topics,
)
as_form
classmethod
as_form(language_model=Form(..., description='The name or identifier of the language model to use.', example='test_model_mock'), output_length=Form(0, description='Desired summary length in DIN-A4 pages. Default 0 = no restriction and high performance.', example=3.0, ge=0), topics=Form(None, description='Comma-separated topics the summary should focus on. Default None = general summary without any focus.', example='public administration, artificial intelligence, digitization'))
Creates an instance of SummaryFileAPIInputParameters from multipart form-data.
FastAPI does not automatically map Pydantic models from form-data in multipart requests. This helper enables the model to be used together with file uploads by defining how form parameters should be parsed.
| PARAMETER | DESCRIPTION |
|---|---|
| language_model | Selected language model. TYPE: str |
| output_length | Desired summary length in pages. TYPE: float |
| topics | User-defined focus topics. TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| Self | A validated input parameter set. |
Source code in docs/microservices/summary/src/models/api_input.py
@classmethod
def as_form(
cls,
language_model: str = Form(
...,
description="The name or identifier of the language model to use.",
example="test_model_mock",
),
output_length: float = Form(
0,
description=(
"Desired summary length in DIN-A4 pages. "
"Default 0 = no restriction and high performance."
),
example=3.0,
ge=0,
),
topics: str | None = Form(
None,
description=(
"Comma-separated topics the summary should focus on. "
"Default None = general summary without any focus."
),
example="public administration, artificial intelligence, digitization",
),
) -> Self:
"""Creates an instance of `SummaryFileAPIInputParameters` from multipart form-data.
FastAPI does not automatically map Pydantic models from form-data in
multipart requests. This helper enables the model to be used together
with file uploads by defining how form parameters should be parsed.
Args:
language_model (str): Selected language model.
output_length (float): Desired summary length in pages.
topics (str | None): User-defined focus topics.
Returns:
A validated input parameter set.
"""
return cls(
language_model=language_model,
output_length=output_length,
topics=topics,
)
SummaryTextAPIInput
Bases: BaseModel
Model defining the input of a valid Summary request for text endpoint.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| text | The text content to be summarized. TYPE: str |
| language_model | The name or identifier of the language model to use. TYPE: str |
| output_length | Desired length of summary output as number of DIN-A4 pages. Default is 0, which will lead to no summary length restrictions. This is the fastest option since the LLM will decide which length is the most suitable one. TYPE: float |
| topics | User input text stating the topics the summary should focus on. Default is None, which will lead to a general summary without any focus topics. TYPE: str \| None |
Source code in docs/microservices/summary/src/models/api_input.py
class SummaryTextAPIInput(BaseModel):
"""Model defining the input of a valid Summary request for text endpoint.
Attributes:
text (str): The text content to be summarized.
language_model (str): The name or identifier of the language model to use.
output_length (float, optional): Desired length of summary output as number of DIN-A4 pages.
Default is 0, which will lead to no summary length restrictions. This is the fastest option since
the LLM will decide, which length is the most suitable one.
topics (str | None): User input text stating the topics the summary should focus on.
Default is None, which will lead to a general summary without any focus topics.
"""
text: str
language_model: str
output_length: float = Field(0, ge=0, strict=True)
topics: str | None = None
model_config = ConfigDict(
json_schema_extra={
"openapi_examples": {
"simple_summary": {
"summary": "Simple summary request",
"description": (
"Example input for a summary without predefined output length or focus topic settings."
),
"value": {
"text": (
"F13 ist eine vollständig souveräne und Modell-agnostische KI-Assistenz, die auf eigener "
"Infrastruktur betrieben werden kann und von der Verwaltung für die Verwaltung entwickelt "
"wurde. F13 stellt eine Vielzahl modularer, flexibel einsetzbarer KI-Funktionen bereit – "
"zum Verarbeiten von Informationen, Erstellen von Texten, oder zum Recherchieren. Nutzende "
"können mit F13 gezielt, wirksam und sicher Generative Künstliche Intelligenz bei "
"täglichen Büroarbeiten einsetzen. Mit dem Übergang in eine Open-Source-Weiterentwicklung "
"wird F13 seit Juli 2025 als gemeinschaftliches Vorhaben geöffnet: Der Quellcode steht "
"frei zur Verfügung, die Nutzung ist offen. Interessierte Verwaltungen, Organisationen "
"sowie Entwicklerinnen und Entwickler sind eingela-den, das System für eigene Bedarfe "
"anzupassen, weiterzuentwickeln und diese Entwicklungen wiederum Open Source "
"bereitzustellen."
),
"language_model": "test_model_mock",
},
},
"fast_summary": {
"summary": "Fast summary request",
"description": (
"Example input for a fast summary without predefined output length but with focus topic "
"setting."
),
"value": {
"text": (
"F13 ist eine vollständig souveräne und Modell-agnostische KI-Assistenz, die auf eigener "
"Infrastruktur betrieben werden kann und von der Verwaltung für die Verwaltung entwickelt "
"wurde. F13 stellt eine Vielzahl modularer, flexibel einsetzbarer KI-Funktionen bereit – "
"zum Verarbeiten von Informationen, Erstellen von Texten, oder zum Recherchieren. Nutzende "
"können mit F13 gezielt, wirksam und sicher Generative Künstliche Intelligenz bei "
"täglichen Büroarbeiten einsetzen. Mit dem Übergang in eine Open-Source-Weiterentwicklung "
"wird F13 seit Juli 2025 als gemeinschaftliches Vorhaben geöffnet: Der Quellcode steht "
"frei zur Verfügung, die Nutzung ist offen. Interessierte Verwaltungen, Organisationen "
"sowie Entwicklerinnen und Entwickler sind eingela-den, das System für eigene Bedarfe "
"anzupassen, weiterzuentwickeln und diese Entwicklungen wiederum Open Source "
"bereitzustellen."
),
"language_model": "test_model_mock",
"topics": "KI, Papier",
},
},
"output_length_summary": {
"summary": "Summary request with output length",
"description": "Example input for a summary with predefined output length",
"value": {
"text": (
"F13 ist eine vollständig souveräne und Modell-agnostische KI-Assistenz, die auf eigener "
"Infrastruktur betrieben werden kann und von der Verwaltung für die Verwaltung entwickelt "
"wurde. F13 stellt eine Vielzahl modularer, flexibel einsetzbarer KI-Funktionen bereit – "
"zum Verarbeiten von Informationen, Erstellen von Texten, oder zum Recherchieren. Nutzende "
"können mit F13 gezielt, wirksam und sicher Generative Künstliche Intelligenz bei "
"täglichen Büroarbeiten einsetzen. Mit dem Übergang in eine Open-Source-Weiterentwicklung "
"wird F13 seit Juli 2025 als gemeinschaftliches Vorhaben geöffnet: Der Quellcode steht "
"frei zur Verfügung, die Nutzung ist offen. Interessierte Verwaltungen, Organisationen "
"sowie Entwicklerinnen und Entwickler sind eingela-den, das System für eigene Bedarfe "
"anzupassen, weiterzuentwickeln und diese Entwicklungen wiederum Open Source "
"bereitzustellen."
),
"output_length": 2.5,
"language_model": "test_model_mock",
},
},
"focus_topic_summary": {
"summary": "Summary request with focus topic",
"description": "Example input for a summary with focus on a specific topic.",
"value": {
"text": (
"F13 ist eine vollständig souveräne und Modell-agnostische KI-Assistenz, die auf eigener "
"Infrastruktur betrieben werden kann und von der Verwaltung für die Verwaltung entwickelt "
"wurde. F13 stellt eine Vielzahl modularer, flexibel einsetzbarer KI-Funktionen bereit – "
"zum Verarbeiten von Informationen, Erstellen von Texten, oder zum Recherchieren. Nutzende "
"können mit F13 gezielt, wirksam und sicher Generative Künstliche Intelligenz bei "
"täglichen Büroarbeiten einsetzen. Mit dem Übergang in eine Open-Source-Weiterentwicklung "
"wird F13 seit Juli 2025 als gemeinschaftliches Vorhaben geöffnet: Der Quellcode steht "
"frei zur Verfügung, die Nutzung ist offen. Interessierte Verwaltungen, Organisationen "
"sowie Entwicklerinnen und Entwickler sind eingeladen, das System für eigene Bedarfe "
"anzupassen, weiterzuentwickeln und diese Entwicklungen wiederum Open Source "
"bereitzustellen."
),
"topics": "Open Source, Community-Gedanke",
"language_model": "test_model_mock",
},
},
}
}
)
api_output
Pydantic Models for API output parameters.
| CLASS | DESCRIPTION |
|---|---|
| SummaryAPIOutput | Summary response output of summary generation. |
SummaryAPIOutput
Bases: BaseModel
Summary response output of summary generation.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| summary | The generated summary text. TYPE: str |
| parsed_text | The parsed and preprocessed source text used for summarization. TYPE: str |
| warning_msg | Optional warning message (e.g. about text length of the generated summary). TYPE: str |
Source code in docs/microservices/summary/src/models/api_output.py
class SummaryAPIOutput(BaseModel):
    """Summary response output of summary generation.

    Attributes:
        summary (str): The generated summary text.
        parsed_text (str): The parsed and preprocessed source text used for summarization.
        warning_msg (str): Optional warning message (e.g. about text length of the generated summary).
    """

    summary: str
    parsed_text: str
    warning_msg: str

    model_config = ConfigDict(
        json_schema_extra={
            "openapi_examples": {
                "summary_output": {
                    "summary": "Summary output with warning",
                    "description": "Example showing a summary with a warning message.",
                    "value": {
                        "summary": "This is the generated summary of the document.",
                        "parsed_text": "Original source text preprocessed for summarization.",
                        "warning_msg": "This is a message to the user encompassing hints or warnings.",
                    },
                },
            }
        }
    )
general
Defines settings and thereby assigns default values.
| CLASS | DESCRIPTION |
|---|---|
| ActiveLLMs | Selects the available models for the respective use cases. |
| InterServiceCommunication | Configuration of all microservice communications. |
| LogLevel | Specifies possible log levels using an enum class. |
| PostConfig | Configuration for async_post requests to other microservices (e.g. parser). |
| Settings | Specifies general settings for the service. |
ActiveLLMs
Bases: BaseModel
Selects the available models for the respective use cases.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| model_config | Used to ignore other services, which are defined in the config. TYPE: ConfigDict |
| summary | List the names of available LLMs for the summary service. TYPE: list[str] |
Source code in docs/microservices/summary/src/models/general.py
class ActiveLLMs(BaseModel):
    """Selects the available models for the respective use cases.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        summary (List(str)): List the names of available LLMs for the summary service.
    """

    model_config = ConfigDict(extra="ignore")
    summary: list[str]
InterServiceCommunication
Bases: BaseModel
Configuration of all microservice communications.
| PARAMETER | DESCRIPTION |
|---|---|
| parser | Default configuration for parsing microservice. TYPE: PostConfig |
Source code in docs/microservices/summary/src/models/general.py
LogLevel
Bases: StrEnum
Specifies possible log levels using an enum class.
Source code in docs/microservices/summary/src/models/general.py
class LogLevel(StrEnum):
    """Specifies possible log levels using a enum class."""

    CRITICAL = "CRITICAL"
    ERROR = "ERROR"
    WARNING = "WARNING"
    INFO = "INFO"
    DEBUG = "DEBUG"

    @classmethod
    def _missing_(cls, value: object) -> None:
        """Converts strings to uppercase and recheck for existence."""
        if isinstance(value, str):
            value = value.upper()
        for level in cls:
            if level == value:
                return level
        return None
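Because `_missing_` uppercases unknown string values, log-level lookups are effectively case-insensitive. A small sketch (the import path is an assumption):

```python
from src.models.general import LogLevel  # import path assumed

assert LogLevel("debug") is LogLevel.DEBUG
assert LogLevel("Info") is LogLevel.INFO
```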
PostConfig
Bases: BaseModel
Configuration for async_post request to other microservices (e.g. parser).
The default values in this class can be overwritten by those values stated in configs/general.yml.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| model_config | Used to ignore other services, which are defined in the config. TYPE: ConfigDict |
| max_attempts | Maximum number of request attempts before returning status code 424. TYPE: PositiveInt |
| timeout_in_s | Maximum waiting duration before timeout (in seconds). TYPE: PositiveInt |
These values can be overwritten by using the following snippet in general.yml:

inter_service_communication:
  parser:
    max_attempts: 3
    connection_timeout: 200
Source code in docs/microservices/summary/src/models/general.py
class PostConfig(BaseModel):
    """Configuration for async_post request to other microservices (e.g. parser).

    The default values in this class can be overwritten by those values stated in configs/general.yml.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        max_attempts (PositiveInt): Maximum number of request attempts before returning status code 424.
        timeout_in_s (PositiveInt): Maximum waiting duration before timeout (in seconds).

    Note: These values can be overwritten by using the following code-sniped in general.yml:
        inter_service_communication:
            parser:
                max_attempts: 3
                connection_timeout: 200
    """

    model_config = ConfigDict(extra="ignore")
    max_attempts: PositiveInt = 3
    timeout_in_s: PositiveInt = 200
Settings
Bases: BaseModel
Specifies general settings for the service.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| model_config | Used to ignore other services, which are defined in the config. TYPE: ConfigDict |
| service_name | Name of service, i.e. 'summary'. TYPE: str |
| service_endpoints | URLs of required services (e.g. parser). TYPE: dict[str, AnyHttpUrl] |
| active_llms | Selection of available models for respective use cases. TYPE: ActiveLLMs |
| log_level | Minimal level of logging output given. TYPE: LogLevel |
| log_file_max_bytes | Max file size for logfile. TYPE: PositiveInt |
| log_file_backup_count | Number of log-files to loop over. TYPE: PositiveInt |
| log_file | Path the logfile is written to. TYPE: FilePath |
| check_llm_api_interval_in_s | Interval for checking all LLM APIs (seconds). TYPE: PositiveInt |
| n_uvicorn_workers | Number of parallel uvicorn instances. TYPE: PositiveInt |
| inter_service_communication | Configuration of communication with other services. TYPE: InterServiceCommunication |

| METHOD | DESCRIPTION |
|---|---|
| ensure_log_dir | Creates the log directory after validation. |
Source code in docs/microservices/summary/src/models/general.py
class Settings(BaseModel):
    """Specifies general settings for the service.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        service_name (str): Name of service, i.e. 'summary'.
        service_endpoints (dict[str, AnyHttpUrl]): URLs of required services (e.g. parser).
        active_llms (ActiveLLMs): Selection of available models for respective use cases.
        log_level (LogLevel): Minimal level of logging output given.
        log_file_max_bytes: (PositiveInt): Max file size for logfile.
        log_file_backup_count (PositiveInt): Number of log-files to loop over.
        log_file (FilePath): Write logfile there.
        check_llm_api_interval_in_s (PositiveInt): Interval for checking all LLM APIs (seconds).
        n_uvicorn_workers (PositiveInt): Number of parallel uvicorn instances.
        inter_service_communictaion (InterServiceCommunication): Configuration of communication with other services.
    """

    model_config = ConfigDict(extra="ignore")
    service_name: str = "Summary"
    service_description: str = "Generation of summaries of files and text using LLMs."
    # number of parallel uvicorn instances
    n_uvicorn_workers: PositiveInt = 1
    active_llms: ActiveLLMs
    # interval for checking all LLM APIs (seconds)
    check_llm_api_interval_in_s: PositiveInt = 60
    service_endpoints: dict[str, AnyHttpUrl]
    log_level: LogLevel = LogLevel.INFO
    log_file_max_bytes: PositiveInt = 1 * 1024 * 1024
    log_file_backup_count: PositiveInt = 3
    log_file: FilePath = Path("/summary/logs/log")
    inter_service_communication: InterServiceCommunication = InterServiceCommunication()

    @model_validator(mode="after")
    def ensure_log_dir(self) -> "Settings":
        """Creates the log directory after validation."""
        self.log_file.parent.mkdir(parents=True, exist_ok=True)
        return self
ensure_log_dir
graph_state
Defines states within the LangGraph performing the summary.
| CLASS | DESCRIPTION |
|---|---|
| OverallState | Defines the overall state of the LangGraph performing the summary. |
| SummaryState | Defines the status for the generation of individual summaries. |
OverallState
Bases: TypedDict
Defines the overall state of the LangGraph performing the summary.
Contains contents and summaries of all chunks.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| contents | List containing the content for each chunk. TYPE: list[str] |
| summaries | List containing the summaries for each chunk's content. TYPE: list[str] |
| collapsed_summaries | List containing the summaries of summaries. TYPE: list[Document] |
| final_summary | Final consolidated summary (part of summary output). TYPE: str |
| messages | List of messages to the user regarding the summary (part of summary output). TYPE: list[str] |
| quit_reducing | Tracks whether the REDUCE loop was successful. TYPE: bool |
| num_reduce_call | Counter of the REDUCE calls for testing purposes. TYPE: int |
| desired_summary_chars | Desired number of chars for the final summary output (reduce-loop criterion). TYPE: int |
| max_input_chars | Maximal number of chars to generate the final summary in one LLM call (reduce criterion). TYPE: int |
| focus_instructions_map | MAP-prompt part instructing to summarize with focus on specified topics. TYPE: str |
| focus_instructions_reduce | REDUCE-prompt part structuring the summary according to topics. TYPE: str |
| focus_instructions_final | FINAL-prompt part structuring the summary according to topics and including a note for all topics that are not covered by the text. TYPE: str |
Source code in docs/microservices/summary/src/models/graph_state.py
class OverallState(TypedDict):
    """Defines the overall state of the LangGraph performing the summary.

    Contains contents and summaries of all chunks.

    Attributes:
        contents (list[str]): List containing the content for each chunk.
        summaries (list[str]): List containing the summaries for each chunks content.
        collapsed_summaries (list[Document]): List containing the summaries of summaries.
        final_summary (str): Final consolidated summary (part of summary output).
        messages (list[str]): List of messages to the user regarding the summary (part of summary output).
        quit_reducing (bool): Tracks whether the REDUCE-Loop was successful.
        num_reduce_call (int): Counter of the REDUCE-calls for testing purposes.
        desired_summary_chars (int): Desired number of chars for the final summary output (reduce-loop criterion).
        max_input_chars (int): Maximal number of chars to generate final summary in one LLM-call (reduce criterion).
        focus_instructions_map (str): MAP-prompt part instructing to summarize with focus on specified topics.
        focus_instructions_reduce (str): REDUCE-Prompt part structuring the summary according to topics.
        focus_instructions_final (str): FINAL-Prompt part structuring the summary according to topics and including a
            note for all topics, that are not covered by the text.
    """

    contents: list[str]
    summaries: Annotated[list, operator.add]
    collapsed_summaries: list[Document]
    final_summary: str
    quit_reducing: bool
    num_reduce_call: int
    messages: list[str]
    desired_summary_chars: int
    max_input_chars: int
    focus_instructions_map: str
    focus_instructions_reduce: str
    focus_instructions_final: str
SummaryState
Bases: TypedDict
Defines the status for the generation of individual summaries.
Used during the mapping process as input for generate_summary().
| ATTRIBUTE | DESCRIPTION |
|---|---|
| content | Text section to be summarized. TYPE: str |
| focus_instructions_map | MAP-prompt part instructing to summarize with focus on specified topics. TYPE: str |
Source code in docs/microservices/summary/src/models/graph_state.py
class SummaryState(TypedDict):
    """Defines the status for the generation of individual summaries.

    Used during the mapping process as input for generate_summary().

    Attributes:
        content (str): Text section to be summarized.
        focus_instructions_map (str): MAP-prompt part instructing to summarize with focus on specified topics.
    """

    content: str
    focus_instructions_map: str
llms
Pydantic Models describing an LLM used for LLM config.
| CLASS | DESCRIPTION |
|---|---|
| APIAuth | Defines authentication settings for an LLM. |
| LLM | Defines the basic structure of a LLM config. |
| LLMAPI | Defines API-Connection to LLM. |
| LLMConfig | Defines the LLMs used for summarization. |
| LLMInference | Defines the inference parameters. |
| LLMPromptComponents | Defines the components of each prompt. |
| LLMPromptConfig | Defines the structure of a LLM prompt configuration. |
| LLMPromptMaps | Defines complete LLM prompt config. |
| LLMPrompts | Defines the prompts. |
APIAuth
Bases: BaseModel
Defines authentication settings for an LLM.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| type | Either 'token' or 'basic_auth'. TYPE: Literal['token', 'basic_auth'] |
| secret_path | File path where the api token or credentials are stored. TYPE: FilePath |

| METHOD | DESCRIPTION |
|---|---|
| get_auth_header | Generate auth part of header for http request. |
Source code in docs/microservices/summary/src/models/llms.py
class APIAuth(BaseModel):
    """Defines Authentification settings for LLM.

    Attributes:
        type (Literal): Either 'token' or 'basic_auth'.
        secret_path (FilePath): File path where the api token or credentials are stored.
    """

    type: Literal["token", "basic_auth"]
    secret_path: FilePath

    @property
    def secret(self) -> SecretStr:
        """Load secret variable as 'secret'."""
        with open(self.secret_path) as file:
            return SecretStr(file.read().strip())

    def get_auth_header(self) -> str:
        """Generate auth part of header for http request.

        Returns:
            str: Auth header.
        """
        auth_header = ""
        if self.type == "basic_auth":
            auth_header = f"Basic {base64.b64encode(self.secret.get_secret_value().encode()).decode()}"
        elif self.type == "token":
            auth_header = f"Bearer {self.secret.get_secret_value()}"
        return auth_header
get_auth_header
Generate auth part of header for http request.
| RETURNS | DESCRIPTION |
|---|---|
| str | Auth header. |
Source code in docs/microservices/summary/src/models/llms.py
def get_auth_header(self) -> str:
    """Generate auth part of header for http request.

    Returns:
        str: Auth header.
    """
    auth_header = ""
    if self.type == "basic_auth":
        auth_header = f"Basic {base64.b64encode(self.secret.get_secret_value().encode()).decode()}"
    elif self.type == "token":
        auth_header = f"Bearer {self.secret.get_secret_value()}"
    return auth_header
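A usage sketch for building an HTTP Authorization header from a stored token (the token file path is a placeholder):

```python
# Hypothetical usage; the token file path is a placeholder.
from pathlib import Path

from src.models.llms import APIAuth  # import path assumed

Path("/tmp/llm_token").write_text("my-api-token")
auth = APIAuth(type="token", secret_path="/tmp/llm_token")
headers = {"Authorization": auth.get_auth_header()}
# headers == {"Authorization": "Bearer my-api-token"}
```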
LLM
Bases: BaseModel
Defines the basic structure of a LLM config.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| label | Human-readable model name that can be presented to users. TYPE: str |
| model | Model name which is used in API call, e.g. ollama tag. TYPE: str |
| prompt_map | Prompt map name to load LLMPromptMaps from. TYPE: str |
| is_remote | Is this LLM hosted at an external API? TYPE: bool \| None |
| context_length | Model's context length. TYPE: PositiveInt |
| api | API information. TYPE: LLMAPI |
| inference | Inference parameters. TYPE: LLMInference \| None |
| prompt_config | Prompts. TYPE: LLMPromptConfig \| None |
Source code in docs/microservices/summary/src/models/llms.py
class LLM(BaseModel):
    """Defines the basic structure of a LLM config.

    Attributes:
        label (str): Human-readable model name that can be presented to users.
        model (str): Model name which is used in API call, e.g. ollama tag.
        prompt_map (str): Prompt map name to load LLMPromptMaps from.
        is_remote (bool | None): Is this LLM hosted at an external API?
        context_length (PositiveInt): Model's context length.
        api (LLMAPI): API information.
        inference (LLMInference | None): Inference parameters.
        prompt_config (LLMPromptConfig | None): Prompts.
    """

    label: str
    model: str
    prompt_map: str
    is_remote: bool | None = True
    context_length: PositiveInt
    api: LLMAPI
    inference: LLMInference | None = LLMInference()
    prompt_config: LLMPromptConfig | None = None
LLMAPI
Bases: BaseModel
Defines API-Connection to LLM.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| url | Url of the LLM. TYPE: AnyHttpUrl |
| health_check | Relative path to health check, i.e. '/models'. TYPE: str \| None |
| auth | Pydantic Model defining the authentication of the LLM. TYPE: APIAuth \| None |

| METHOD | DESCRIPTION |
|---|---|
| get_health_check_url | Get the URL to check if API is available. |
Source code in docs/microservices/summary/src/models/llms.py
class LLMAPI(BaseModel):
    """Defines API-Connection to LLM.

    Attributes:
        url (AnyHttpUrl): Url of the LLM.
        health_check (str | None): Relative path to health check, i.e. '/models'.
        auth (APIAuth | None): Pydantic Model defining the authentication of the LLM.
    """

    url: AnyHttpUrl
    health_check: str | None = None
    auth: APIAuth | None = None

    def get_health_check_url(self) -> str:
        """Get the URL to check if API is available."""
        if self.health_check:
            # make sure to remove trailing and leading slashes to not override path
            return urljoin(
                str(self.url).rstrip("/") + "/",
                self.health_check.lstrip("/"),
            )
        return str(self.url)
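A sketch of how the health-check URL is derived from the configured base URL (values and import path are assumptions):

```python
from src.models.llms import LLMAPI  # import path assumed

api = LLMAPI(url="http://llm-host:8000/v1", health_check="/models")
print(api.get_health_check_url())  # http://llm-host:8000/v1/models

api_without_check = LLMAPI(url="http://llm-host:8000/v1")
print(api_without_check.get_health_check_url())  # the base URL unchanged
```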
get_health_check_url
Get the URL to check if API is available.
Source code in docs/microservices/summary/src/models/llms.py
LLMConfig
Bases: BaseModel
Defines the LLMs used for summarization.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| model_config | Used to ignore other services, which are defined in the config. TYPE: ConfigDict |
| summary | Dictionary containing a name and definition of the LLMs available for summarization. TYPE: dict[str, LLM] \| None |
Source code in docs/microservices/summary/src/models/llms.py
class LLMConfig(BaseModel):
    """Defines the LLMs used for summarization.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        summary (dict[str, LLM] | None): Dictionary containing a name and definition of LLMs's
            available for summarization.
    """

    model_config = ConfigDict(extra="ignore")
    summary: dict[str, LLM] | None = []

    def __iter__(self) -> Iterator[str]:
        """Get 'keys' for automatic merge with i.e. LLMPromptConfig.

        Returns:
            Iterator[str]: keys
        """
        return iter(self.__dict__.keys())

    def __getitem__(self, service: str) -> dict[str, LLM]:
        """Get all LLMs for a given service (e.g. "summary", "rag").

        Args:
            service (str): The service name (e.g., "summary", "rag").

        Returns:
            dict[str, LLM]: All configered LLMs for the given service.
        """
        return self.__getattribute__(service)
LLMInference
Bases: BaseModel
Defines the inference parameters.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| temperature | Randomness / variation of the output. High values indicate more creativity. Default is 0.1. TYPE: PositiveFloat \| None |
| max_tokens | Maximum number of tokens of the generated response. Default is 2048. TYPE: PositiveInt \| None |
| top_p | Threshold for sampling only from the most likely tokens. Default is 0.1. TYPE: float \| None |
| timeout | Maximal waiting time before the request is canceled due to an absent response. Default is 600. TYPE: PositiveInt |
| max_retries | Number of request retries in case of failure. Default is 3. TYPE: PositiveInt \| None |
| frequency_penalty | Likelihood of the model repeating the same phrases. Default is 0.1. TYPE: PositiveFloat \| None |
| presence_penalty | Penalizing tokens that have already appeared. Default is 0.1. TYPE: PositiveFloat \| None |
Source code in docs/microservices/summary/src/models/llms.py
class LLMInference(BaseModel):
    """Defines the inference parameters.

    Attributes:
        temperature (PositiveFloat | None): Randomness / variation of the output High values indicate more creativity.
            Default is 0.1.
        max_tokens (PositiveInt | None): Maximum number of tokens of the generated response. Default is 2048.
        top_p (PositiveFloat | None): Threshold for sampling only from the most likely tokens. Default is 0.1.
        timeout (int): Maximal waiting time before request is canceled due to absent response. Default is 600.
        max_retries (int | None): Number of request retries in case of failure. Default is 5.
        frequency_penalty (float | None): Likelihood of the model repeating the same phrases. Default is 0.1.
        presence_penalty (float | None): Penalizing tokens that have already appeared. Default is 0.1.
    """

    temperature: PositiveFloat | None = 0.1
    max_tokens: PositiveInt | None = 2048
    top_p: float | None = 0.1
    timeout: PositiveInt = 600
    max_retries: PositiveInt | None = 3
    frequency_penalty: PositiveFloat | None = 0.1
    presence_penalty: PositiveFloat | None = 0.1
LLMPromptComponents
Bases: BaseModel
Defines the components of each prompt.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| model_config | Used to ignore other services, which are defined in the config. TYPE: ConfigDict |
| main | Prompts for MAP, REDUCE or FINAL step of summarization. It can be expanded by further instructions (e.g., focus_instructions) to activate additional summary features. TYPE: str |
| focus_instructions | Parts of MAP, REDUCE or FINAL prompt containing the instructions to focus the summary on topics specified by the user. TYPE: str |
Source code in docs/microservices/summary/src/models/llms.py
class LLMPromptComponents(BaseModel):
    """Defines the components of each prompt.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        main (str): Prompts for MAP, REDUCE or FINAL step of summarization. It can be expanded by further instructions
            (e.g., focus_instructions) to activate additional summary features.
        focus_instructions (str): Parts of MAP, REDUCE or FINAL prompt containing the instructions to focus the summary
            on topics specified by the user.
    """

    model_config = ConfigDict(extra="ignore")
    main: str
    focus_instructions: str
LLMPromptConfig
Bases: BaseModel
Defines the structure of a LLM prompt configuration.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| model_config | Used to ignore other services, which are defined in the config. TYPE: ConfigDict |
| system | Prompt that defines the role and the rules for the LLM's behaviour. TYPE: LLMPrompts |
| user | Prompt that gives specific instructions and user input to the LLM. TYPE: LLMPrompts \| None |
| assistant | Prompt that supports the user. TYPE: LLMPrompts \| None |
Source code in docs/microservices/summary/src/models/llms.py
class LLMPromptConfig(BaseModel):
    """Defines the structure of a LLM prompt configuration.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        system (str | None): Prompt that defines the role and the rules for the LLMs behaviour.
        user (str | None): Prompt that gives specific instructions and user input to the LLM.
        assistant (str | None): Prompt that supports the user.
    """

    model_config = ConfigDict(extra="ignore")
    system: LLMPrompts
    user: LLMPrompts | None = None
    assistant: LLMPrompts | None = None
LLMPromptMaps
Bases: BaseModel
Defines complete LLM prompt config.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| model_config | Used to ignore other services, which are defined in the config. TYPE: ConfigDict |
| summary | Dictionary containing a name and prompts of the LLMs available for summarization. TYPE: dict[str, LLMPromptConfig] |
Source code in docs/microservices/summary/src/models/llms.py
class LLMPromptMaps(BaseModel):
    """Defines complete LLM prompt config.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        summary (dict[str, LLMPromptConfig]): Dictionary containing a name and prompts of LLMs's available for
            summarization.
    """

    model_config = ConfigDict(extra="ignore")
    summary: dict[str, LLMPromptConfig]

    def __iter__(self) -> Iterator[str]:
        """Get 'keys' for automatic merge with i.e. LLMConfig.

        Returns:
            Iterator[str]: Keys
        """
        return iter(self.__dict__.keys())
LLMPrompts
Bases: BaseModel
Defines the prompts.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| model_config | Used to ignore other services, which are defined in the config. TYPE: ConfigDict |
| map | Prompt parameters of the MAP prompt, which is used to generate a summary for a text chunk. TYPE: LLMPromptComponents |
| reduce | Prompt parameters of the REDUCE prompt, which is used to reduce the size of the intermediate summaries. TYPE: LLMPromptComponents |
| final | Prompt parameters of the FINAL prompt, which is used to consolidate the intermediate summaries into a final summary. TYPE: LLMPromptComponents |
| prepare_focus_topics | Prompt used to extract the topics provided by users and transform them into the proper format (a string of topics separated by commas). TYPE: str |
Source code in docs/microservices/summary/src/models/llms.py
class LLMPrompts(BaseModel):
    """Defines the prompts.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        map (LLMPromptComponents): Prompt parameters of MAP prompt, which is used to generate a summary for a
            text-chunk.
        reduce (LLMPromptComponents): Prompt parameters of REDUCE prompt, which is used to reduce the size of the
            intermediate summaries.
        final (LLMPromptComponents): Prompt parameters of FINAL prompt, which is used to consolidate the intermediate
            summaries into a final summary.
        prepare_focus_topics (str): Prompt used to extract the topics provided by users and transform them in to the
            proper format (string of topics separated by comma).
    """

    model_config = ConfigDict(extra="ignore")
    map: LLMPromptComponents
    reduce: LLMPromptComponents
    final: LLMPromptComponents
    prepare_focus_topics: str
parser_output
Defines the output of the parsing microservice.
| CLASS | DESCRIPTION |
|---|---|
| Chunk | Chunk of the parsed text incl. text and metadata. |
| ChunkMetadata | Defines the metadata of each chunk. |
| ParsingOutput | Defines a parsing output model used to hold parsed text and input for chunking. |
Chunk
Bases: BaseModel
Chunk of the parsed text incl. text and metadata.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| chunk_content | Text content of this chunk. TYPE: str |
| chunk_metadata | Metadata of this chunk (as defined by ChunkMetadata). TYPE: ChunkMetadata |
Source code in docs/microservices/summary/src/models/parser_output.py
ChunkMetadata
Bases: BaseModel
Defines the metadata of each chunk.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| chunk_number | Id of the chunk. TYPE: int |
| chunk_length | Length of the chunk's content as number of chars. TYPE: int |
| filename | The name of the original file or, in case of text input, "Texteingabe" or "Default Filename". TYPE: str |
| filetype | The type of the original file or, in case of text input, "string". TYPE: str |
| headings | List of headings. Empty if there is no heading for this chunk. TYPE: list[str] \| None |
| pages | List of pages within the original pdf document. None in case of chunks from text, txt or docx. TYPE: list[int] \| None |
| locations | Covering boundingbox and charspan of docling metadata for pdf files only. None in case of chunks from text, txt or docx. TYPE: list[dict[str, Any]] \| None |
Source code in docs/microservices/summary/src/models/parser_output.py
class ChunkMetadata(BaseModel):
"""Defines the metadata of each chunk.
Attributes:
chunk_number (int): Id of the chunk.
chunk_length (int): Length of the chunk's content as number of chars.
filename (str): The name of the original file or in case of text input "Texteingabe" or "Default Filename".
filetype (str): The type of the original file or in case of text input "string".
headings (list[str] | None): List of headings. Empty if there is no heading to this chunk.
pages (list[int] | None): List of pages within the original pdf document.
None in case of chunks from text, txt or docx.
locations (list[dict[str, Any]] | None): Bounding box and charspan from the docling metadata, for pdf files only.
None in case of chunks from text, txt or docx.
"""
chunk_number: int
chunk_length: int
filename: str
filetype: str
headings: list[str] | None
pages: list[int] | None
locations: list[dict[str, Any]] | None
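For illustration, the two sketches below show what ChunkMetadata could look like for a chunk from a parsed PDF and for plain text input; all concrete values, the import path and the keys inside locations are made up.

```python
# Illustrative only: example metadata for a chunk from a parsed PDF.
from src.models.parser_output import ChunkMetadata  # import path assumed

pdf_chunk_meta = ChunkMetadata(
    chunk_number=3,
    chunk_length=1842,
    filename="report_2024.pdf",
    filetype="pdf",
    headings=["2 Results", "2.1 Evaluation"],
    pages=[4, 5],
    # keys inside the location dicts are assumptions, not the exact docling schema
    locations=[{"bounding_box": [72.0, 120.5, 510.3, 680.0], "charspan": [0, 1842]}],
)

# For text input, pages and locations are None and filename/filetype use the fixed values.
text_chunk_meta = ChunkMetadata(
    chunk_number=0,
    chunk_length=950,
    filename="Texteingabe",
    filetype="string",
    headings=[],
    pages=None,
    locations=None,
)
```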
ParsingOutput
Bases: BaseModel
Defines a parsing output model used to hold parsed text and input for chunking.
| ATTRIBUTE | DESCRIPTION |
|---|---|
text |
Cleaned and parsed text.
TYPE:
|
chunks |
List of chunks with their content and metadata.
TYPE:
|
warning_msg |
Message to the user containing information about the resulting text.
TYPE:
|
model_config |
Used to ignore deprecated parts of the ParsingOutput, which the summary does not use.
TYPE:
|
Source code in docs/microservices/summary/src/models/parser_output.py
class ParsingOutput(BaseModel):
"""Defines a parsing output model used to hold parsed text and input for chunking.
Attributes:
text (str): Cleaned and parsed text.
chunks (list[Chunk]): List of chunks with their content and metadata.
warning_msg (str, optional): Message to the user containing information about the resulting text.
model_config (ConfigDict): Used to ignore deprecated parts of the ParsingOutput, which the summary does not use.
"""
model_config = ConfigDict(extra="ignore")
text: str
chunks: list[Chunk] | None = None
warning_msg: str = ""
summary_parameters
Pydantic models for parameters necessary for the summarization.
| CLASS | DESCRIPTION |
|---|---|
PromptParameters |
Prompt parameters for the MAP, REDUCE and FINAL prompt according to the topics stated by the user. |
SummarizeInput |
Defines input that is used for summarizing files. |
SummaryParameters |
Parameters needed for summarization, which are computed according to the user's input. |
PromptParameters
Bases: BaseModel
Prompt parameters for the MAP, REDUCE and FINAL prompt according to the topics stated by the user.
These prompts cover instructions to focus and structure the summary on specific topics and to state if any topic is not covered by the input text.
These prompt parts are set by configure_prompt_parameters() in the summary registry. If there are no topics to focus on, the prompt parts will be set to empty strings by default.
| ATTRIBUTE | DESCRIPTION |
|---|---|
focus_instructions_map |
Part of MAP-Prompt setting a focus on specific topics.
TYPE:
|
focus_instructions_reduce |
Part of REDUCE-Prompt structuring the summary with respect to specific topics.
TYPE:
|
focus_instructions_final |
Part of FINAL-Prompt checking if topics are not covered by the summary.
TYPE:
|
Source code in docs/microservices/summary/src/models/summary_parameters.py
class PromptParameters(BaseModel):
"""Prompt parameters for the MAP, REDUCE and FINAL prompt according to the topics stated by the user.
These prompts cover instructions to focus and structure the summary on specific topics and to state if any topic is
not covered by the input text.
These prompt parts are set by configure_prompt_parameters() in the summary registry.
If there are no topics to focus on, the prompt parts will be set to empty strings by default.
Attributes:
focus_instructions_map (str): Part of MAP-Prompt setting a focus on specific topics.
focus_instructions_reduce (str): Part of REDUCE-Prompt structuring the summary with respect to specific topics.
focus_instructions_final (str): Part of FINAL-Prompt checking if topics are not covered by the summary.
"""
focus_instructions_map: str = ""
focus_instructions_reduce: str = ""
focus_instructions_final: str = ""
SummarizeInput
Bases: BaseModel
Defines input that is used for summarizing files.
| ATTRIBUTE | DESCRIPTION |
|---|---|
language_model |
Name of the language model.
TYPE:
|
chunks |
List of chunks with their content and metadata.
TYPE:
|
summary_parameters |
Parameters needed for summarization (e.g. settings for length and focus topics), which are computed according to the user's input.
TYPE:
|
Source code in docs/microservices/summary/src/models/summary_parameters.py
class SummarizeInput(BaseModel):
"""Defines input that is used for summarizing files.
Attributes:
language_model (str): Name of the language model.
chunks (list[Document]): List of chunks with their content and metadata.
summary_parameters (SummaryParameters): Parameters needed for summarization (e.g. settings for length and focus
topics), which are computed according to the user's input.
"""
language_model: str
chunks: list[Document]
summary_parameters: SummaryParameters
SummaryParameters
Bases: BaseModel
Parameters needed for summarization, which are computed according to the user's input.
| ATTRIBUTE | DESCRIPTION |
|---|---|
messages |
List of messages to the user regarding the summary output.
TYPE:
|
desired_summary_chars |
Desired number of characters for the final summary output.
TYPE:
|
max_input_chars |
Maximal number of input characters for the current summary set up.
TYPE:
|
prompt_parameters |
Additional instructions for MAP, REDUCE and FINAL prompt according to the topics set by the user.
TYPE:
|
Source code in docs/microservices/summary/src/models/summary_parameters.py
class SummaryParameters(BaseModel):
"""Parameters needed for summarization, which are computed according the users input.
Attributes:
messages (list[str]): List of messages to the user, which regard the summary output.
desired_summary_chars (NonNegativeInt) : Desired number of characters for the final summary output.
max_input_chars (NonNegativeInt): Maximal number of input characters for the current summary set up.
prompt_parameters (PromptParameters): Additional instructions for MAP, REDUCE and FINAL prompt according to the
topics set by the user.
"""
messages: list[str]
# Output-Length Feature
desired_summary_chars: NonNegativeInt
max_input_chars: NonNegativeInt
# Focus-Topic Feature
prompt_parameters: PromptParameters = PromptParameters()
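To show how these three models fit together, the sketch below assembles a SummarizeInput by hand from langchain Documents; the model name, the character limits and the import path are placeholders.

```python
# Illustrative only: building a SummarizeInput with placeholder values.
from langchain_core.documents import Document

from src.models.summary_parameters import (  # import path assumed
    PromptParameters,
    SummarizeInput,
    SummaryParameters,
)

chunks = [
    Document(page_content="First chunk of the parsed text ..."),
    Document(page_content="Second chunk of the parsed text ..."),
]

summarize_input = SummarizeInput(
    language_model="example_llm",              # placeholder model name
    chunks=chunks,
    summary_parameters=SummaryParameters(
        messages=[],
        desired_summary_chars=4000,            # roughly one DIN A4 page
        max_input_chars=24000,                 # placeholder limit
        prompt_parameters=PromptParameters(),  # no focus topics -> empty instructions
    ),
)
```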
settings
Loads all settings from a central place, not hidden in utils.
summarizing
Implementation of the core logic of the summary.
| MODULE | DESCRIPTION |
|---|---|
summary |
Summary class. |
summary_registry |
Summary Registry contains all summary pipelines. |
summary
Summary class.
This class contains all important attributes and methods in order to generate a summary from chunks. The input chunks need to be langchain_core.documents.Document objects.
The code is based on a guide from langchain (last access 2025-04-02): https://python.langchain.com/docs/versions/migrating_chains/map_reduce_chain/
| CLASS | DESCRIPTION |
|---|---|
Summary |
Manages the summarization of documents with a LangGraph workflow. |
Summary
Manages the summarization of documents with a LangGraph workflow.
It summarizes chunks, recursively reduces these summaries if necessary (depending on max_input_chars), and generates a final consolidated summary.
| ATTRIBUTE | DESCRIPTION |
|---|---|
graph |
The compiled LangGraph used to perform the map and reduce summarize approach.
TYPE:
|
llm |
LLM that should be used to generate the summary.
TYPE:
|
auth_client |
Authentication to connect with LLM API.
TYPE:
|
model_provider |
LLM with configured parameters.
TYPE:
|
map_chain |
Chain using the map prompt.
TYPE:
|
reduce_chain |
Chain using the reduce prompt.
TYPE:
|
final_chain |
Chain using the final prompt.
TYPE:
|
| METHOD | DESCRIPTION |
|---|---|
calculate_max_input_chars |
Calculating the maximal number of characters of text input which should be summarized in one LLM call. |
configure_prompt_parameters |
Set parameters for the MAP, REDUCE and FINAL prompt according to the user's input. |
get_chunk_size |
Computes chunk size minimum and maximum according to the specifications of the LLM and the input text length. |
get_info |
Returns pipeline information (label, name, placeholder, is_remote). |
load_basic_auth |
Load env-variable and check if it is missing. Split into username and password. |
load_secret |
Load env-variable and check if it is missing. |
process_desired_summary_length |
Processes the user input for the desired summary length. |
process_topic_input |
Processes the focus topics provided by the user by extracting the topics in the proper format. |
summarize |
Generates a summary for a list of chunks. |
Source code in docs/microservices/summary/src/summarizing/summary.py
class Summary:
"""Manages the summarization of documents with a LangGraph workflow.
It summarizes chunks, recursively reduces these summaries if necessary (depending on max_input_chars),
and generates a final consolidated summary.
Attributes:
graph (StateGraph): The compiled LangGraph used to perform the map and reduce summarize approach.
llm (LLM): LLM that should be used to generate the summary.
auth_client (CustomAuthClient): Authentication to connect with LLM API.
model_provider (langchain_openai.llms.base.ChatOpenAI): LLM with configured parameters.
map_chain (langchain_classic.chains.base.Chain): Chain using the map prompt.
reduce_chain (langchain_classic.chains.base.Chain): Chain using the reduce prompt.
final_chain (langchain_classic.chains.base.Chain): Chain using the final prompt.
"""
def __init__(self, llm: LLM, llm_name: str) -> None:
"""Initializes the summarization pipeline.
The initialization sets up the class attributes (LLM name, LLM, auth client, model provider,
the chains) and constructs the graph.
"""
self.llm_name: str = llm_name
self.llm: LLM = llm
self._setup_auth_client()
self._setup_model_provider()
self._setup_chains()
self._construct_graph()
async def summarize(self, summarize_input: SummarizeInput) -> dict[str, Any]:
"""Generates a summary for a list of chunks.
The content of each chunk is summarized by the map chain.
These summaries are then collected. If the aggregated length of these summaries
exceeds the maximum (max_input_chars), the summaries are summarized again using the reduce chain.
This process is repeated until the condition is satisfied.
Then the final summary is generated with the final chain.
Args:
summarize_input (SummarizeInput): Containing chunks incl. metadata, messages, prompt parameters and LLM.
Returns:
Contains the summary as str and messages to the user as list.
"""
logger.info(f"Start summarizing {len(summarize_input.chunks)} chunks.")
steps = []
async for step in self.graph.astream(
# initialize OverallState
{
"contents": [doc.page_content for doc in summarize_input.chunks],
"focus_instructions_map": summarize_input.summary_parameters.prompt_parameters.focus_instructions_map,
"focus_instructions_reduce": (
summarize_input.summary_parameters.prompt_parameters.focus_instructions_reduce
),
"focus_instructions_final": (
summarize_input.summary_parameters.prompt_parameters.focus_instructions_final
),
"desired_summary_chars": summarize_input.summary_parameters.desired_summary_chars,
"max_input_chars": summarize_input.summary_parameters.max_input_chars,
"messages": summarize_input.summary_parameters.messages,
"num_reduce_call": 0,
"quit_reducing": False,
},
{"recursion_limit": 40},
):
steps.append(step)
logger.debug("Finished summarizing.")
summary = steps[-1].get("generate_final_summary").get("final_summary")
messages = steps[-1].get("generate_final_summary").get("messages")
return {"summary": summary, "messages": messages}
def get_info(self) -> dict[str, Any]:
"""Returns pipeline information (label, name, placeholder, is_remote).
Returns:
Pipeline information (label, name, placeholder, is_remote).
"""
return {
"label": self.llm.label,
"name": self.llm_name,
"is_remote": self.llm.is_remote,
}
def calculate_max_input_chars(
self, desired_summary_chars: int, remaining_context_length: int
) -> int:
"""Calculating the maximal number of characters of text input which should be summarized in one LLM call.
The calculation is based on the specifications of the currently used LLM, which are the context length and
the maximum length for generated output. These values are stated in the llms.yml.
The context length needs to encompass the length of the system-prompt, the length of the text input and
the length of the summary as output. Therefore the desired summary length set by the user is factored into
the calculation to adapt the output length. Furthermore the context window should not be maxed out in order to
ensure high-quality summaries (therefore we only use 70% of the theoretically remaining max input length).
The desired output length (incl. additional 20% buffer) determines the max input length, if set.
If the user does not set a desired length, the remaining characters are split by 70:30 for input length
and output length.
max_input_chars is used to determine the number of recursions of the reduce part of the summarization process.
Args:
desired_summary_chars (int): Desired summary length as number of chars.
remaining_context_length (int): Remaining context length after subtracting all prompt lengths and a 30%
buffer during max_input_chars estimation.
Returns:
Maximal number of input characters for the current summary set up.
"""
if desired_summary_chars > 0:
output_buffer = 1.2 # 20% extra as buffer for summary output length
max_llm_output_chars = (
self.llm.inference.max_tokens * 4
) # 1 token ~ 4 chars
output_chars = min(
int(desired_summary_chars * output_buffer),
max_llm_output_chars,
)
max_input_chars = remaining_context_length - output_chars
logger.debug(
f"The max_input_chars are set to {max_input_chars} according to:"
f"remaining_context_length {remaining_context_length} - output_chars {output_chars}."
f"with remaining_context_length = 80% of (LLMs maximal number of input chars - the longest"
f" prompt length (map/reduce/final prompt)) and with output_chars = min( desired summary length"
f" in chars * 1.2 (buffer), maximal number of output chars of the LLM."
)
else:
max_input_chars = int(remaining_context_length * 0.7)
logger.debug(
f"The max_input_chars are set to {max_input_chars} = 70% of remaining_context_length "
f"{remaining_context_length} with remaining_context_length = 80% of (LLMs maximal number of "
f"input chars - the longest prompt length (map/reduce/final prompt))."
)
return max_input_chars
def process_desired_summary_length(
self, desired_summary_length: dict[str, Any]
) -> dict[str, Any]:
"""Processes the user input for the desired summary length.
Checks if the value is valid. If not, set to default value to ignore user input.
Calculates the summary length in characters (desired_summary_chars) using an estimate of 4000 chars per page.
Args:
desired_summary_length (dict): Containing the information needed to compute the desired summary
length from the two keys output_length (desired length of the summary output
as number of pages) and input_length (length of the summary input text,
i.e. the parsing output length as number of characters).
Returns:
Dictionary with following keys:
- desired_summary_chars (int) : Desired number of characters for the final summary output.
- messages (list[str]) : List of messages to the user regarding the summary.
"""
messages = []
# get number of characters for summary length parameters
minimal_length = 500 # prevent hallucinations
half_text_length = int(
0.5 * desired_summary_length["input_length"]
) # 4000 chars ~ 1 DIN A4 page
max_length = 60000 # 60000 chars ~ 15 DIN A4 pages
if desired_summary_length["output_length"] <= 0:
logger.warning("Using default summary length.")
desired_summary_chars = 0
messages.append(
"Die Zusammenfassungslänge entspricht der Standardeinstellung."
)
else:
desired_summary_chars = int(desired_summary_length["output_length"] * 4000)
desired_summary_chars = max(minimal_length, desired_summary_chars)
desired_summary_chars = min(
half_text_length, desired_summary_chars, max_length
)
if desired_summary_chars == minimal_length:
messages.append(
"Die erstellte Zusammenfassung weicht von der Ziellänge ab. "
"Mehr Informationen hierzu finden Sie in den FAQ."
)
logger.debug(
f"The desired summary length is set to the minimum of {minimal_length} chars."
)
elif desired_summary_chars == half_text_length:
messages.append(
"Die erstellte Zusammenfassung weicht von der Ziellänge ab. "
"Mehr Informationen hierzu finden Sie in den FAQ."
)
logger.debug(
f"The desired summary length is set to the maximum of {half_text_length} chars, "
"which is half the input text length."
)
logger.info(
f"The desired summary length is set to {desired_summary_chars} chars based on "
f"desired {desired_summary_length['output_length']} pages (and "
f"half input text length = {half_text_length} chars)."
)
return {
"desired_summary_chars": desired_summary_chars,
"messages": messages,
}
def get_chunk_size(self, text_length: int, max_input_chars: int) -> dict[str, int]:
"""Computes chuncksize minimum and maximum according to the specifications of the LLM and the input text length.
Ensures that max_chunk_size does not exceed 20% of text length or 25% of max input length of the LLM (
max_input_chars as number of characters). These 25% could be modified.
Ensure minimal chunk size of 500 characters to avoid hallucinations and ensure proper range of chunk sizes by
limiting min chunk size to 45% of max chunk size. The chunker demands the following ratio:
min_chunk_size * 2 <= max_chunk_size.
Args:
text_length (int): Length of the parsed text.
max_input_chars (int): Maximal number of input chars the LLM can process in one call.
Returns:
Dictionary containing the minimum and maximum chunk size as number of characters.
"""
min_number_of_chunks = 5
max_chunk_size = min(
int(0.25 * max_input_chars),
max(int(text_length / min_number_of_chunks), 500),
)
logger.debug(
f"The maximal chunk size is set to {max_chunk_size} based on: "
f"min(int(0.25 * max_input_chars), max(int(text_length/{min_number_of_chunks}), 500)) = min("
f"{int(0.25 * max_input_chars)}, max({int(text_length / min_number_of_chunks)}, 500))."
)
min_chunk_size = max(int(0.5 * max_chunk_size), 500)
logger.debug(
f"The minimal chunk size is set to {min_chunk_size} based on: max(500, 0.45 * maximal chunk size) "
f"(={int(0.45 * max_chunk_size)}))."
)
return {"min_chunk_size": min_chunk_size, "max_chunk_size": max_chunk_size}
def configure_prompt_parameters(self, topics: str | None) -> PromptParameters:
"""Set parameters for the MAP, REDUCE and FINAL prompt according to users input.
This covers instructions to focus on specific topics and stating if any topic is not covered by the input text.
Args:
topics (str | None): A comma-separated string listing the topics the summary should focus on.
Returns:
PromptParameters: Parts of MAP, REDUCE and FINAL Prompts containing the instructions to focus the summary on
topics, if specified by the user.
"""
if topics:
prompt_parameters = PromptParameters(
focus_instructions_map=self.llm.prompt_config.system.map.focus_instructions.format(
topics=topics
),
focus_instructions_reduce=self.llm.prompt_config.system.reduce.focus_instructions,
focus_instructions_final=self.llm.prompt_config.system.final.focus_instructions.format(
topics=topics
),
)
logger.debug(
f"MAP prompt part for topic focus instructions {prompt_parameters.focus_instructions_map=}.\n"
f"REDUCE prompt part for topic focus instructions {prompt_parameters.focus_instructions_reduce}.\n"
f"FINAL prompt part for topic focus instructions {prompt_parameters.focus_instructions_final}."
)
else:
prompt_parameters = PromptParameters()
logger.info(
"The input field for focus topics is empty, so a general summary will be generated."
)
return prompt_parameters
async def process_topic_input(self, topics: str) -> str:
"""Processes the focus topics provided by the user by extracting the topics in the proper format.
This converts user input that does not meet the required format, such as short sentences, into a comma-separated
string listing the topics the summary should focus on.
Args:
topics (str): User input text stating the topics the summary should focus on.
Returns:
str: A comma-separated string listing the topics the summary should focus on.
"""
if topics.strip() != "":
prep_topics_prompt_input = {"topics": "topics"}
prep_topics_prompt = ChatPromptTemplate.from_template(
template=self.llm.prompt_config.system.prepare_focus_topics,
template_format="f-string",
partial_variables=prep_topics_prompt_input,
)
logger.debug(f"Prompt for topic preparation: '{prep_topics_prompt}'.")
self.prep_topics_chain = (
prep_topics_prompt | self.model_provider | StrOutputParser()
)
topics = await self.prep_topics_chain.ainvoke({"topics": topics})
logger.info(f"Topics after preparation: '{topics}'.")
if topics.strip() == "":
topics = None
return topics
def _setup_auth_client(self) -> None:
"""Set up authentication client for various APIs.
Sets up an authentication client using either a token, credentials or no authentication method.
Note: for Ollama usage no authentication method is needed.
"""
if self.llm.api.auth:
secret = self.load_secret(self.llm.api.auth.secret_path)
auth_client = CustomAuthClient(
secret=secret, auth_type=self.llm.api.auth.type
)
else:
auth_client = CustomAuthClient()
self.auth_client = auth_client
def _setup_model_provider(self) -> None:
"""Set up LLM provider using OpenAI API.
Initializing the LLM with values from llms.yml
"""
self.model_provider = ChatOpenAI(
model_name=self.llm.model,
http_async_client=self.auth_client,
api_key=" ",
base_url=str(self.llm.api.url),
timeout=self.llm.inference.timeout,
max_tokens=self.llm.inference.max_tokens,
max_retries=self.llm.inference.max_retries,
temperature=self.llm.inference.temperature,
top_p=self.llm.inference.top_p,
frequency_penalty=self.llm.inference.frequency_penalty,
presence_penalty=self.llm.inference.presence_penalty,
streaming=False,
)
def _construct_graph(self) -> None:
"""Creates the LangGraph for the summary. Using a Map and Reduce Approach."""
graph = StateGraph(OverallState)
# nodes / components
graph.add_node("generate_summary", self._generate_summary)
graph.add_node("collect_summaries", self._collect_summaries)
graph.add_node("collapse_summaries", self._collapse_summaries)
graph.add_node("generate_final_summary", self._generate_final_summary)
# edges / connections
graph.add_conditional_edges(START, self._map_summaries, ["generate_summary"])
graph.add_edge("generate_summary", "collect_summaries")
graph.add_conditional_edges("collect_summaries", self._should_collapse)
graph.add_conditional_edges("collapse_summaries", self._should_collapse)
graph.add_edge("generate_final_summary", END)
self.graph = graph.compile()
def _setup_chains(self) -> None:
"""Set up chains using map, reduce and final prompts and chains."""
map_prompt_input = {
"content": "content",
"focus_instructions": "focus_instructions",
}
map_prompt = ChatPromptTemplate.from_template(
template=self.llm.prompt_config.system.map.main,
template_format="f-string",
partial_variables=map_prompt_input,
)
logger.debug(f"MAP Prompt: '{map_prompt}'.")
self.map_chain = map_prompt | self.model_provider | StrOutputParser()
reduce_prompt_input = {
"summaries": "summaries",
"focus_instructions": "focus_instructions",
}
reduce_prompt = ChatPromptTemplate.from_template(
template=self.llm.prompt_config.system.reduce.main,
template_format="f-string",
partial_variables=reduce_prompt_input,
)
logger.debug(f"REDUCE Prompt: '{reduce_prompt}'.")
self.reduce_chain = reduce_prompt | self.model_provider | StrOutputParser()
final_prompt_input = {
"summaries": "summaries",
"focus_instructions": "focus_instructions",
}
final_prompt = ChatPromptTemplate.from_template(
template=self.llm.prompt_config.system.final.main,
template_format="f-string",
partial_variables=final_prompt_input,
)
logger.debug(f"FINAL Prompt: '{final_prompt}'.")
self.final_chain = final_prompt | self.model_provider | StrOutputParser()
def _map_summaries(self, state: OverallState) -> list:
"""Prepares the parallel summarization of each chunks contents.
Args:
state (OverallState): Contains the state of the summary process.
Returns:
List of processing steps as 'Send' objects. Each 'Send' object consists of the name of a node in the graph
as well as the state to send to that node.
"""
return [
Send(
"generate_summary",
{
"content": content,
"focus_instructions_map": state["focus_instructions_map"],
},
)
for content in state["contents"]
]
async def _generate_summary(self, summary_state: SummaryState) -> dict[str, str]:
"""Generates a summary of a text section.
Args:
summary_state (SummaryState): State of the node which contains the content which we want to summarize
and prompt parts as additional instructions according to the user input
(e.g. setting a focus on specified topics).
Returns:
A dictionary with key "summaries", which contains a list of summaries resulting from the map prompt.
"""
response = await self.map_chain.ainvoke(
{
"content": summary_state["content"],
"focus_instructions": summary_state["focus_instructions_map"],
}
)
logger.debug("LLM call with map prompt.")
if response is None:
response = await self._retry_ainvoke(
chain="map",
chain_input={
"content": summary_state["content"],
"focus_instructions_map": summary_state["focus_instructions_map"],
},
)
return {"summaries": [response]}
def _collect_summaries(self, state: OverallState) -> dict[str, Any]:
"""Collect summaries from mapping step and store them in OverallState for collapse/reduce step.
Args:
state (OverallState): Contains the state of the summary process with summaries.
Returns:
Update of "collapsed summaries" as part of OverallState of the summary graph.
"collapsed summaries" contains the list of summaries as Documents.
"""
new_state = {
"collapsed_summaries": [Document(summary) for summary in state["summaries"]]
}
return new_state
def _should_collapse(
self, state: OverallState
) -> Literal["collapse_summaries", "generate_final_summary"]:
"""Checks if the collected summaries are short enough to generate a final summary.
If their total length exceeds the maximal length (max_input_chars) or if the desired summary length (with a
tolerance of 25%) is exceeded, the summaries need to be collapsed first to reduce their size.
If the previous reduction failed, no further reduce loops are started and the graph will continue.
Args:
state : Contains the state of the summary process with summaries.
Returns:
Name of the next node.
"""
logger.debug(
f"The current number of reduce calls is '{state['num_reduce_call']}'."
)
length = self._length_function(documents=state["collapsed_summaries"])
logger.debug(
f"The intermediate summaries are currently {length} chars long, "
f"with a desired summary length of {state['desired_summary_chars']} chars "
)
reduce_condition = False
if (not state["quit_reducing"]) and state["desired_summary_chars"] != 0:
tolerance = 1.25
exceeded_desired_length = length > (
tolerance * state["desired_summary_chars"]
)
reduce_condition = (
length > state["max_input_chars"]
) or exceeded_desired_length
logger.debug(
"Does the current summary exceeded desired summary length with a tolerance of "
f"{tolerance}? {exceeded_desired_length}"
)
else:
reduce_condition = length > state["max_input_chars"]
if (not state["quit_reducing"]) and reduce_condition:
logger.debug(
f"Start reduce-loop, because current {len(state['collapsed_summaries'])} summaries "
f"are {length} characters long. And need to be < desired summary length "
f"{state['desired_summary_chars']} and < max_input_chars {state['max_input_chars']}. "
f"These conditions currently are {reduce_condition}."
)
return "collapse_summaries"
else:
if state["quit_reducing"]:
logger.warning(
"Reduce-loop could not reduce summary length. Moving on to generation of final summary."
)
logger.info(
f"Input for final summary generation: {len(state['collapsed_summaries'])} intermediate summaries with "
f"a length of {length} characters"
)
return "generate_final_summary"
async def _collapse_summaries(self, state: OverallState) -> dict[str, Any]:
"""Reduces a long list of summaries by further summarizing them using the reduce chain.
Afterwards the list of summaries does not exceed max_input_chars anymore.
Args:
state (OverallState): Contains the state of the summary process.
Returns:
Update of the of OverallState of the summary graph for the following attributes:
- "collapsed_summaries" containing a list of summaries as Documents,
- "num_reduce_call" tracking the number of calls using the reduce prompt,
- "quit_reducing" tracking wether this reduce loop was successful,
- "message" a message containing relevant information for the user.
"""
doc_lists = split_list_of_docs(
state["collapsed_summaries"],
self._length_function,
state["max_input_chars"],
)
collapse_results = []
num_reduce_calls = state["num_reduce_call"]
for doc_list in doc_lists:
collapse_results.append(
await self._acollapse_docs(
docs=doc_list,
focus_instructions_reduce=state["focus_instructions_reduce"],
)
)
num_reduce_calls += 1
quit_reducing = not self._check_reduce_success(
state=state, collapse_results=collapse_results
)
new_state = {
"collapsed_summaries": collapse_results,
"num_reduce_call": num_reduce_calls,
"quit_reducing": quit_reducing,
"messages": state["messages"],
}
return new_state
async def _acollapse_docs(
self, docs: list[Document], focus_instructions_reduce: str
) -> Document:
"""Execute a collapse function on a set of documents and merge their metadata.
This function is an adapted version of acollapse_docs from langchain_classic.chains.combine_documents.reduce
to avoid entering lists into the reduce prompt.
All metadata values are strings, and in case of overlapping keys across the input documents the values
get joined by ", ".
Args:
docs (list[Document]): A list of Documents to combine.
focus_instructions_reduce (str) : REDUCE-Prompt part structuring the summary according to topics.
Returns:
A single Document with the output of reduce chain for the page content
and the combined metadata of all the input documents.
"""
doc_contents = ""
for doc in docs:
doc_contents = doc_contents + doc.page_content + "\n\n"
combined_content = await self.reduce_chain.ainvoke(
{
"summaries": doc_contents,
"focus_instructions": focus_instructions_reduce,
}
)
logger.debug(
f"LLM call with reduce prompt: reducing {len(docs)} chunks into one chunk"
)
if combined_content is None:
combined_content = await self._retry_ainvoke(
chain="reduce",
chain_input={
"summaries": doc_contents,
"focus_instructions_reduce": focus_instructions_reduce,
},
)
combined_metadata = {k: str(v) for k, v in docs[0].metadata.items()}
for doc in docs[1:]:
for k, v in doc.metadata.items():
if k in combined_metadata:
combined_metadata[k] += f", {v}"
else:
combined_metadata[k] = str(v)
combined_doc = Document(
page_content=combined_content, metadata=combined_metadata
)
return combined_doc
def _length_function(self, documents: list[Document]) -> int:
"""Computes the cumulative length for a list of documents.
To this end each document's content (excluding metadata) is measured and summed up.
Args:
documents (List[Document]): Each document consists of content and metadata.
Returns:
Total length of all documents contents.
"""
return sum(len(doc.page_content) for doc in documents)
def _check_reduce_success(
self, state: OverallState, collapse_results: list[Document]
) -> bool:
"""Checks if _collapse_summaries was successful and returns a bool used for error handling.
Args:
state (OverallState): Contains the state of the summary process.
collapse_results (list[Document]): List of summaries as documents after reduce step.
Returns:
True if the reduce chain successfully reduced the length of the summaries.
False if the reduce chain failed and needs to be quit to prevent infinite loops.
"""
length_collapsed_summaries = self._length_function(documents=collapse_results)
length_summaries_before = self._length_function(
documents=state["collapsed_summaries"]
)
if length_collapsed_summaries >= length_summaries_before:
quit_reducing = True
else:
quit_reducing = state["quit_reducing"]
logger.debug(
f"successfuly reduced length of intermediate summaries (length after collapse {length_collapsed_summaries}"
f" = length before collapse {length_summaries_before}? -> "
f"{length_collapsed_summaries == length_summaries_before}"
)
return not quit_reducing
async def _generate_final_summary(self, state: OverallState) -> dict[str, Any]:
"""Generate the final summary from the collapsed summaries.
Args:
state (OverallState): Contains the state of the summary process with reduced summaries.
Returns:
New state as dictionary containing the final summary and messages to the user.
"""
messages = state["messages"]
response = None
if (
len(state["collapsed_summaries"]) == 1
and state["focus_instructions_final"] == ""
):
# if there is just one Summary after REDUCE step use it as final summary
logger.debug(
"Skipping final prompt because we already have only 1 summary, which has the desired length."
)
response = state["collapsed_summaries"][0].page_content
else:
# gather input for final summary
collapsed_summaries = ""
for summary in state["collapsed_summaries"]:
collapsed_summaries = (
collapsed_summaries + summary.page_content + "\n\n"
)
if (
len("".join(collapsed_summaries.split())) < 1
and state["focus_instructions"] != ""
):
# handle empty summary in case of topic mismatch
logger.warning(
"There is no infomation in the text matching the topics specified by the user."
"Therfore there is no summary to reply with."
)
response = (
"Der Text enthält keine Informationen zu den gewünschten Themen."
)
messages.append(
"Entferne die eingegebenen Themen und wiederhole deine Anfrage, "
"um eine allgemeine Zusammenfassung des Textes zu erhalten."
)
else:
# generate final summary which considers the topics set by the user
logger.debug(
f"Generate final summary from the {len(state['collapsed_summaries'])} intermediate summaries."
)
response = await self.final_chain.ainvoke(
{
"summaries": collapsed_summaries,
"focus_instructions": state["focus_instructions_final"],
}
)
logger.debug("LLM call with final prompt.")
if response is None:
response = await self._retry_ainvoke(
chain="final",
chain_input={
"summaries": collapsed_summaries,
"focus_instructions_final": state[
"focus_instructions_final"
],
},
)
if state["desired_summary_chars"] != 0:
# handle mismatch of the desired summary length and the actual summary length
not_reduced = state["num_reduce_call"] == 0
summary_length = len(response)
tolerance = 0.5
too_short = summary_length < (
state["desired_summary_chars"] * (1 - tolerance)
)
too_long = summary_length > (
state["desired_summary_chars"] * (1 + tolerance)
)
logger.info(
f"Final summary has a length of {summary_length}. Is this {too_short=} ...because chunk summaries "
f"already have been to short? {too_short and not_reduced} or is it {too_long=} ... because the reduce "
f"loop could not reduce any further? {state['quit_reducing']}."
)
if (not_reduced and too_short) or (state["quit_reducing"] and too_long):
messages.append(
"Die erstellte Zusammenfassung weicht von der Ziellänge ab. "
"Mehr Informationen hierzu finden Sie in den FAQ."
)
new_state = {"final_summary": response, "messages": messages}
return new_state
async def _retry_ainvoke(self, chain: str, chain_input: dict) -> str:
"""This function performs a retry mechanism for LLM calls.
It can be used with map chain, reduce chain or final chain.
Args:
chain (str): Selecting the chain that should be run (map, reduce or final).
chain_input (dict): Input for map/reduce/final chain call.
Returns:
Resulting summary from map/reduce/final chain call.
Raises:
HTTPException: HTTP_502_BAD_GATEWAY raised if the selected language model did not respond after the maximal
number of retries.
"""
response = None
max_retries = self.llm.inference.max_retries
for attempt_counter in range(max_retries):
try:
if chain == "map":
response = await self.map_chain.ainvoke(chain_input)
elif chain == "reduce":
response = await self.reduce_chain.ainvoke(chain_input)
elif chain == "final":
response = await self.final_chain.ainvoke(chain_input)
else:
logger.error(
"Retry function did not get a valid chain. "
f"Got '{chain}' instead of either 'map', 'reduce' or 'final'."
)
if response is not None:
break
except Exception as e:
if attempt_counter < max_retries - 1:
logger.warning(
f"LLM API did not respond. Retrying ({attempt_counter + 1} / {max_retries})."
)
await asyncio.sleep(1)
else:
logger.critical(
f"LLM API did not respond after maximal number of retries. Giving up. Error: '{str(e)}'."
)
raise HTTPException(
status_code=status.HTTP_502_BAD_GATEWAY,
detail=(
"Aus technischen Gründen kann vorübergehend keine Zusammenfassung erstellt werden. "
"Bitte versuchen Sie es später erneut. "
),
)
return response
@staticmethod
def load_secret(file_path: str | Path) -> str:
"""Load env-variable and check if it is missing.
Args:
file_path (str | Path): Path of token file to be loaded.
Returns:
Loaded token
Raises:
Exception: Raised in case of a FileNotFoundError because the token cannot be loaded.
"""
try:
with open(file_path) as f:
token = f.read().splitlines()[0]
return token
except FileNotFoundError:
logger.critical(f"Could not find Token - Check your folder: '{file_path}'")
raise Exception
@staticmethod
def load_basic_auth(file_path: str | Path) -> HTTPBasicAuth:
"""Load env-variable and check if it is missing. Split into username and password.
Args:
file_path (str | Path): Path of token file to be loaded.
Returns:
Loaded username and password.
Raises:
ValueError: Raised in case the credentials can not be loaded.
"""
credentials = Summary.load_secret(file_path)
try:
username, password = credentials.split(":")
except ValueError as e:
logger.debug(
"Credentials could not be loaded. Please check the credentials."
"Hint: credentials should be in the format 'username:password'."
f" Error: {str(e)}"
)
raise ValueError(
"Unable to establish connection: Invalid credentials format."
)
return HTTPBasicAuth(username, password)
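A minimal usage sketch of the class, assuming an llm entry loaded from the LLM configuration (llms.yml) and a prepared SummarizeInput as defined in models/summary_parameters.py; in the running service this wiring is done by the SummaryRegistry, and error handling is omitted here.

```python
# Hedged usage sketch: driving a single Summary pipeline directly.
# `llm` is assumed to be one configured LLM entry; "example_llm" is a placeholder name.
from src.summarizing.summary import Summary  # import path assumed


async def run_summary(llm, summarize_input) -> dict:
    pipeline = Summary(llm=llm, llm_name="example_llm")  # sets up auth client, chains and graph
    print(pipeline.get_info())  # {"label": ..., "name": "example_llm", "is_remote": ...}

    # Optionally derive focus instructions from free-text user input.
    topics = await pipeline.process_topic_input("costs, deadlines")
    summarize_input.summary_parameters.prompt_parameters = (
        pipeline.configure_prompt_parameters(topics)
    )

    # Run the map-reduce graph and return {"summary": ..., "messages": [...]}.
    return await pipeline.summarize(summarize_input)
```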
calculate_max_input_chars
Calculating the maximal number of characters of text input which should be summarized in one LLM call.
The calculation is based on the specifications of the currently used LLM, which are the context length and the maximum length for generated output. These values are stated in the llms.yml. The context length needs to encompass the length of the system-prompt, the length of the text input and the length of the summary as output. Therefore the desired summary length set by the user is factored into the calculation to adapt the output length. Furthermore the context window should not be maxed out in order to ensure high-quality summaries (therefore we only use 70% of the theoretically remaining max input length).
The desired output length (incl. additional 20% buffer) determines the max input length, if set. If the user does not set a desired length, the remaining characters are split by 70:30 for input length and output length.
max_input_chars is used to determine the number of recursions of the reduce part of the summarization process.
| PARAMETER | DESCRIPTION |
|---|---|
desired_summary_chars
|
Desired summary length as number of chars.
TYPE:
|
remaining_context_length
|
Remaining context length after subtracting all prompt lengths and a 30% buffer during max_input_chars estimation.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
int
|
Maximal number of input characters for the current summary set up. |
Source code in docs/microservices/summary/src/summarizing/summary.py
def calculate_max_input_chars(
self, desired_summary_chars: int, remaining_context_length: int
) -> int:
"""Calculating the maximal number of characters of text input which should be summarized in one LLM call.
The calculation is based on the specifications of the currently used LLM, which are the context length and
the maximum length for generated output. These values are stated in the llms.yml.
The context length needs to encompass the length of the system-prompt, the length of the text input and
the length of the summary as output. Therefore the desired summary length set by the user is factored into
the calculation to adapt the output length. Furthermore the context window should not be maxed out in order to
ensure high-quality summaries (therefore we only use 70% of the theoretically remaining max input length).
The desired output length (incl. additional 20% buffer) determines the max input length, if set.
If the user does not set a desired length, the remaining characters are split by 70:30 for input length
and output length.
max_input_chars is used to determine the number of recursions of the reduce part of the summarization process.
Args:
desired_summary_chars (int): Desired summary length as number of chars.
remaining_context_length (int): Remaining context length after subtracting all prompt lengths and a 30%
buffer during max_input_chars estimation.
Returns:
Maximal number of input characters for the current summary set up.
"""
if desired_summary_chars > 0:
output_buffer = 1.2 # 20% extra as buffer for summary output length
max_llm_output_chars = (
self.llm.inference.max_tokens * 4
) # 1 token ~ 4 chars
output_chars = min(
int(desired_summary_chars * output_buffer),
max_llm_output_chars,
)
max_input_chars = remaining_context_length - output_chars
logger.debug(
f"The max_input_chars are set to {max_input_chars} according to:"
f"remaining_context_length {remaining_context_length} - output_chars {output_chars}."
f"with remaining_context_length = 80% of (LLMs maximal number of input chars - the longest"
f" prompt length (map/reduce/final prompt)) and with output_chars = min( desired summary length"
f" in chars * 1.2 (buffer), maximal number of output chars of the LLM."
)
else:
max_input_chars = int(remaining_context_length * 0.7)
logger.debug(
f"The max_input_chars are set to {max_input_chars} = 70% of remaining_context_length "
f"{remaining_context_length} with remaining_context_length = 80% of (LLMs maximal number of "
f"input chars - the longest prompt length (map/reduce/final prompt))."
)
return max_input_chars
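A worked example with made-up numbers: for an LLM with max_tokens = 2000 (about 8000 output characters at 4 characters per token) and remaining_context_length = 40000, a desired summary length of 8000 characters gives output_chars = min(int(8000 * 1.2), 8000) = 8000 and therefore max_input_chars = 40000 - 8000 = 32000. Without a desired length, max_input_chars = int(40000 * 0.7) = 28000.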
configure_prompt_parameters
Set parameters for the MAP, REDUCE and FINAL prompt according to the user's input.
This covers instructions to focus on specific topics and stating if any topic is not covered by the input text.
| PARAMETER | DESCRIPTION |
|---|---|
topics
|
A comma-separated string listing the topics the summary should focus on.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
PromptParameters
|
Parts of MAP, REDUCE and FINAL Prompts containing the instructions to focus the summary on topics, if specified by the user.
TYPE:
|
Source code in docs/microservices/summary/src/summarizing/summary.py
def configure_prompt_parameters(self, topics: str | None) -> PromptParameters:
"""Set parameters for the MAP, REDUCE and FINAL prompt according to users input.
This covers instructions to focus on specific topics and stating if any topic is not covered by the input text.
Args:
topics (str | None): A comma-separated string listing the topics the summary should focus on.
Returns:
PromptParameters: Parts of MAP, REDUCE and FINAL Prompts containing the instructions to focus the summary on
topics, if specified by the user.
"""
if topics:
prompt_parameters = PromptParameters(
focus_instructions_map=self.llm.prompt_config.system.map.focus_instructions.format(
topics=topics
),
focus_instructions_reduce=self.llm.prompt_config.system.reduce.focus_instructions,
focus_instructions_final=self.llm.prompt_config.system.final.focus_instructions.format(
topics=topics
),
)
logger.debug(
f"MAP prompt part for topic focus instructions {prompt_parameters.focus_instructions_map=}.\n"
f"REDUCE prompt part for topic focus instructions {prompt_parameters.focus_instructions_reduce}.\n"
f"FINAL prompt part for topic focus instructions {prompt_parameters.focus_instructions_final}."
)
else:
prompt_parameters = PromptParameters()
logger.info(
"The input field for focus topics is empty, so a general summary will be generated."
)
return prompt_parameters
get_chunk_size
Computes chunk size minimum and maximum according to the specifications of the LLM and the input text length.
Ensures that max_chunk_size does not exceed 20% of text length or 25% of max input length of the LLM (max_input_chars as number of characters). These 25% could be modified. Ensures a minimal chunk size of 500 characters to avoid hallucinations and ensures a proper range of chunk sizes by limiting min chunk size to 45% of max chunk size. The chunker demands the following ratio: min_chunk_size * 2 <= max_chunk_size.
| PARAMETER | DESCRIPTION |
|---|---|
text_length
|
Length of the parsed text.
TYPE:
|
max_input_chars
|
Maximal number of input chars the LLM can process in one call.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
dict[str, int]
|
Dictionary containing the minimum and maximum chunk size as number of characters. |
Source code in docs/microservices/summary/src/summarizing/summary.py
def get_chunk_size(self, text_length: int, max_input_chars: int) -> dict[str, int]:
"""Computes chuncksize minimum and maximum according to the specifications of the LLM and the input text length.
Ensures that max_chunk_size does not exceed 20% of text length or 25% of max input length of the LLM (
max_input_chars as number of characters). These 25% could be modified.
Ensure minimal chunk size of 500 characters to avoid hallucinations and ensure proper range of chunk sizes by
limiting min chunk size to 45% of max chunk size. The chunker demands the following ratio:
min_chunk_size * 2 <= max_chunk_size.
Args:
text_length (int): Length of the parsed text.
max_input_chars (int): Maximal number of input chars the LLM can process in one call.
Returns:
Dictionary containing the minimum and maximum chunk size as number of characters.
"""
min_number_of_chunks = 5
max_chunk_size = min(
int(0.25 * max_input_chars),
max(int(text_length / min_number_of_chunks), 500),
)
logger.debug(
f"The maximal chunk size is set to {max_chunk_size} based on: "
f"min(int(0.25 * max_input_chars), max(int(text_length/{min_number_of_chunks}), 500)) = min("
f"{int(0.25 * max_input_chars)}, max({int(text_length / min_number_of_chunks)}, 500))."
)
min_chunk_size = max(int(0.5 * max_chunk_size), 500)
logger.debug(
f"The minimal chunk size is set to {min_chunk_size} based on: max(500, 0.45 * maximal chunk size) "
f"(={int(0.45 * max_chunk_size)}))."
)
return {"min_chunk_size": min_chunk_size, "max_chunk_size": max_chunk_size}
get_info
Returns pipeline information (label, name, placeholder, is_remote).
| RETURNS | DESCRIPTION |
|---|---|
dict[str, Any]
|
Pipeline information (label, name, placeholder, is_remote). |
Source code in docs/microservices/summary/src/summarizing/summary.py
load_basic_auth
staticmethod
Load env-variable and check if it is missing. Split into username and password.
| PARAMETER | DESCRIPTION |
|---|---|
file_path
|
Path of token file to be loaded.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
HTTPBasicAuth
|
Loaded username and password. |
| RAISES | DESCRIPTION |
|---|---|
ValueError
|
Raised in case the credentials can not be loaded. |
Source code in docs/microservices/summary/src/summarizing/summary.py
@staticmethod
def load_basic_auth(file_path: str | Path) -> HTTPBasicAuth:
"""Load env-variable and check if it is missing. Split into username and password.
Args:
file_path (str | Path): Path of token file to be loaded.
Returns:
Loaded username and password.
Raises:
ValueError: Raised in case the credentials can not be loaded.
"""
credentials = Summary.load_secret(file_path)
try:
username, password = credentials.split(":")
except ValueError as e:
logger.debug(
"Credentials could not be loaded. Please check the credentials."
"Hint: credentials should be in the format 'username:password'."
f" Error: {str(e)}"
)
raise ValueError(
"Unable to establish connection: Invalid credentials format."
)
return HTTPBasicAuth(username, password)
load_secret
staticmethod
Load env-variable and check if it is missing.
| PARAMETER | DESCRIPTION |
|---|---|
file_path
|
Path of token file to be loaded.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
str
|
Loaded token |
| RAISES | DESCRIPTION |
|---|---|
Exception
|
Raised in case of a FileNotFoundError because the token cannot be loaded. |
Source code in docs/microservices/summary/src/summarizing/summary.py
@staticmethod
def load_secret(file_path: str | Path) -> str:
"""Load env-variable and check if it is missing.
Args:
file_path (str | Path): Path of token file to be loaded.
Returns:
Loaded token
Raises:
Exception: Raised in case of a FileNotFoundError because the token cannot be loaded.
"""
try:
with open(file_path) as f:
token = f.read().splitlines()[0]
return token
except FileNotFoundError:
logger.critical(f"Could not find Token - Check your folder: '{file_path}'")
raise Exception
process_desired_summary_length
Processes the user input for the desired summary length.
Checks if the value is valid. If not, set to default value to ignore user input. Calculates the summary length in characters (desired_summary_chars) using an estimate of 4000 chars per page.
| PARAMETER | DESCRIPTION |
|---|---|
desired_summary_length
|
Containing the information needed to compute the desired summary length from the two keys output_length (desired length of the summary output as number of pages) and input_length (length of the summary input text, i.e. the parsing output length as number of characters).
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
dict[str, Any]
|
Dictionary with following keys: - desired_summary_chars (int) : Desired number of characters for the final summary output. - messages (list[str]) : List of messages to the user regarding the summary. |
Source code in docs/microservices/summary/src/summarizing/summary.py
def process_desired_summary_length(
self, desired_summary_length: dict[str, Any]
) -> dict[str, Any]:
"""Processes the user input for the desired summary length.
Checks if the value is valid. If not, set to default value to ignore user input.
Calculates the summary length in characters (desired_summary_chars) using an estimate of 4000 chars per page.
Args:
desired_summary_length (dict): Containing the information needed to compute the desired summary
length from the two keys output_length (desired length of the summary output
as number of pages) and input_length (length of the summary input text,
i.e. the parsing output length as number of characters).
Returns:
Dictionary with following keys:
- desired_summary_chars (int) : Desired number of characters for the final summary output.
- messages (list[str]) : List of messages to the user regarding the summary.
"""
messages = []
# get number of characters for summary length parameters
minimal_length = 500 # prevent hallucinations
half_text_length = int(
0.5 * desired_summary_length["input_length"]
) # 4000 chars ~ 1 DIN A4 page
max_length = 60000 # 60000 chars ~ 15 DIN A4 pages
if desired_summary_length["output_length"] <= 0:
logger.warning("Using default summary length.")
desired_summary_chars = 0
messages.append(
"Die Zusammenfassungslänge entspricht der Standardeinstellung."
)
else:
desired_summary_chars = int(desired_summary_length["output_length"] * 4000)
desired_summary_chars = max(minimal_length, desired_summary_chars)
desired_summary_chars = min(
half_text_length, desired_summary_chars, max_length
)
if desired_summary_chars == minimal_length:
messages.append(
"Die erstellte Zusammenfassung weicht von der Ziellänge ab. "
"Mehr Informationen hierzu finden Sie in den FAQ."
)
logger.debug(
f"The desired summary length is set to the minimum of {minimal_length} chars."
)
elif desired_summary_chars == half_text_length:
messages.append(
"Die erstellte Zusammenfassung weicht von der Ziellänge ab. "
"Mehr Informationen hierzu finden Sie in den FAQ."
)
logger.debug(
f"The desired summary length is set to the maximum of {half_text_length} chars, "
"which is half the input text length."
)
logger.info(
f"The desired summary length is set to {desired_summary_chars} chars based on "
f"desired {desired_summary_length['output_length']} pages (and "
f"half input text length = {half_text_length} chars)."
)
return {
"desired_summary_chars": desired_summary_chars,
"messages": messages,
}
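A worked example with made-up numbers: for input_length = 30000 characters and output_length = 2 pages, desired_summary_chars starts at 2 * 4000 = 8000, is raised to at least 500 and then capped by min(15000, 8000, 60000) = 8000, so the user's target is kept. With output_length = 10 pages on the same text, the cap min(15000, 40000, 60000) = 15000 (half the input length) applies and a message about the deviation from the target length is added.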
process_topic_input
async
Processes the focus topics provided by the user by extracting the topics in the proper format.
This converts user input that does not meet the required format, such as short sentences, into a comma-separated string listing the topics the summary should focus on.
| PARAMETER | DESCRIPTION |
|---|---|
topics
|
User input text stating the topics the summary should focus on.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
str
|
A comma-separated string listing the topics the summary should focus on.
TYPE:
|
Source code in docs/microservices/summary/src/summarizing/summary.py
async def process_topic_input(self, topics: str) -> str:
"""Processes the focus topics provided by the user by extracting the topics in the proper format.
This converts user input that does not meet the required format, such as short sentences, into a comma-separated
string listing the topics the summary should focus on.
Args:
topics (str): User input text stating the topics the summary should focus on.
Returns:
str: A comma-separated string listing the topics the summary should focus on.
"""
if topics.strip() != "":
prep_topics_prompt_input = {"topics": "topics"}
prep_topics_prompt = ChatPromptTemplate.from_template(
template=self.llm.prompt_config.system.prepare_focus_topics,
template_format="f-string",
partial_variables=prep_topics_prompt_input,
)
logger.debug(f"Prompt for topic preparation: '{prep_topics_prompt}'.")
self.prep_topics_chain = (
prep_topics_prompt | self.model_provider | StrOutputParser()
)
topics = await self.prep_topics_chain.ainvoke({"topics": topics})
logger.info(f"Topics after preparation: '{topics}'.")
if topics.strip() == "":
topics = None
return topics
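For example, a free-text input such as "Please focus on costs and the project timeline" would typically be turned by the prepare_focus_topics prompt into something like "costs, project timeline"; if the LLM returns an empty string, topics is set to None and a general summary is generated downstream. The exact output of course depends on the configured prompt and model.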
summarize
async
Generates a summary for a list of chunks.
The content of each chunk is summarized by the map chain. These summaries are then collected. If the aggregated length of these summaries exceeds the maximum (max_input_chars), the summaries are summarized again using the reduce chain. This process is repeated until the condition is satisfied. Then the final summary is generated with the final chain.
| PARAMETER | DESCRIPTION |
|---|---|
summarize_input
|
Containing chunks incl. metadata, messages, prompt parameters and LLM.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
dict[str, Any]
|
Contains the summary as str and messages to the user as list. |
Source code in docs/microservices/summary/src/summarizing/summary.py
async def summarize(self, summarize_input: SummarizeInput) -> dict[str, Any]:
"""Generates a summary for a list of chunks.
The content of each chunk is summarized by the map chain.
These summaries are then collected. If the aggregated length of these summaries
exceeds the maximum (max_input_chars), the summaries are summarized again using the reduce chain.
This process is repeated until the condition is satisfied.
Then the final summary is generated with the final chain.
Args:
summarize_input (SummarizeInput): Containing chunks incl. metadata, messages, prompt parameters and LLM.
Returns:
Contains the summary as str and messages to the user as list.
"""
logger.info(f"Start summarizing {len(summarize_input.chunks)} chunks.")
steps = []
async for step in self.graph.astream(
# initialize OverallState
{
"contents": [doc.page_content for doc in summarize_input.chunks],
"focus_instructions_map": summarize_input.summary_parameters.prompt_parameters.focus_instructions_map,
"focus_instructions_reduce": (
summarize_input.summary_parameters.prompt_parameters.focus_instructions_reduce
),
"focus_instructions_final": (
summarize_input.summary_parameters.prompt_parameters.focus_instructions_final
),
"desired_summary_chars": summarize_input.summary_parameters.desired_summary_chars,
"max_input_chars": summarize_input.summary_parameters.max_input_chars,
"messages": summarize_input.summary_parameters.messages,
"num_reduce_call": 0,
"quit_reducing": False,
},
{"recursion_limit": 40},
):
steps.append(step)
logger.debug("Finished summarizing.")
summary = steps[-1].get("generate_final_summary").get("final_summary")
messages = steps[-1].get("generate_final_summary").get("messages")
return {"summary": summary, "messages": messages}
summary_registry
Summary Registry contains all summary pipelines.
| CLASS | DESCRIPTION |
|---|---|
SummaryRegistry |
Manages summary pipelines including parameter configurations. |
SummaryRegistry
Manages summary pipelines including parameter configurations.
| ATTRIBUTE | DESCRIPTION |
|---|---|
summary_pipelines |
A list containing all pipelines (one for each available LLM).
TYPE:
|
| METHOD | DESCRIPTION |
|---|---|
configure_summary_parameters |
Computes parameter based on parsing output and user input to finish initialization of the summary pipeline. |
estimate_max_input_chars |
Calculating the maximal number of characters of text input which should be summarized in one LLM call. |
request_summary |
Start the summarization of chunks according to the given parameters. |
Source code in docs/microservices/summary/src/summarizing/summary_registry.py
class SummaryRegistry:
"""Manages summary pipelines including parameter configurations.
Attributes:
summary_pipelines (list[Summary]): A list containing all pipelines (one for each available LLM).
"""
def __init__(
self,
) -> None:
"""Initializes the list of Summary instances."""
self.summary_pipelines = self._initialize_pipelines()
def _initialize_pipelines(self) -> list[Summary]:
"""Load all available summary pipelines based on custom configuration.
Returns:
A list containing all pipelines (one for each available LLM).
"""
pipelines = []
for llm_name, llm in llm_config.summary.items():
pipelines.append(Summary(llm, llm_name))
logger.info(f"Initialized {len(pipelines)} summary pipelines.")
return pipelines
async def request_summary(
self,
summarize_input: SummarizeInput,
text: str,
) -> SummaryAPIOutput:
"""Start the summarization of chunks according to the given parameters.
Args:
summarize_input (SummarizeInput): Containing chunks incl. metadata, messages, prompt_parameters and LLM.
text (str): Parsed input file or text.
Returns:
Contains the summary, a message to the user and the parsing output.
Raises:
HTTPException: HTTP_424_FAILED_DEPENDENCY raised if the text could not be summarized due to missing chunks.
HTTP_400_BAD_REQUEST raised if the selected language model was invalid.
"""
# errorhandling for missing chunks
if not summarize_input.chunks:
logger.error(
"No summarization possible because there are no chunks to summarize."
)
raise HTTPException(
status_code=status.HTTP_424_FAILED_DEPENDENCY,
detail=(
"Der Text konnte nicht verarbeitet werden. Bitte versuchen Sie es mit einer anderen Datei."
),
)
# select and start pipeline
summary_pipeline = None
summary_result = {}
for pipeline in self.summary_pipelines:
if pipeline.llm_name == summarize_input.language_model:
summary_pipeline = pipeline
summary_result = await summary_pipeline.summarize(
summarize_input=summarize_input,
)
break
# errorhandling for missing pipeline
if not summary_pipeline:
logger.error(
f"Warning: Invalid 'language_model' requested: {summarize_input.language_model}."
)
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail=(
f"Mit dem ausgewählten Sprachmodell {summarize_input.language_model} konnte keine Zusammenfassung "
"generiert werden. Bitte versuchen Sie es mit einem anderen Sprachmodell."
),
)
return SummaryAPIOutput(
summary=summary_result["summary"],
warning_msg=" ".join(set(summary_result["messages"])),
parsed_text=text,
)
async def configure_summary_parameters(
self,
warning_msg: list[str],
language_model: str,
remaining_context_length: int,
desired_summary_length: dict,
topics: str | None,
) -> SummaryParameters:
"""Computes parameter based on parsing output and user input to finish initialization of the summary pipeline.
Computes chuncksize minimum and maximum according to the specifications of the LLM and the input text length.
Calculating the maximal input length (max_input_chars) which should be summarized in one LLM call,
otherwise the summary needs to perform another recursion of its reduce part.
The desired summary length in chars is needed as reduce-loop criterion and to influence the length of the LLM
response to the map and reduce prompt.
Args:
warning_msg (list[str]): List of messages to the user, which regard the summary output.
language_model (str): Name of the LLM as stated in llms.yml
desired_summary_length (dict): Containing the information needed to compute the desired summary
length from the two keys output_length (float) and input_length (int).
remaining_context_length (int): Remaining context length after subtracting all prompt lengths and a 30%
buffer during max_input_chars estimation.
topics (str | None): User input text stating the topics the summary should focus on.
Returns:
SummaryParameters: Parameters needed for summarization, which are computed according to the user's input.
Raises:
HTTPException: HTTP_400_BAD_REQUEST raised if the selected language model was invalid.
"""
summary_pipeline = None
messages = warning_msg
# get summary pipeline for calculations
for pipeline in self.summary_pipelines:
if pipeline.llm_name == language_model:
summary_pipeline = pipeline
break
if not summary_pipeline:
logger.error(
"Parameters can not be computed because there is no summary pipeline initialized, "
f"which uses {language_model}."
)
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail=(
f"Es wurde ein ungültiges Sprachmodell ausgewählt: {language_model}."
" Bitte versuchen Sie es mit einem anderen Sprachmodell."
),
)
# prepare topics and set them as focus for summarization
if topics:
topics = await summary_pipeline.process_topic_input(topics=topics)
prompt_parameters = summary_pipeline.configure_prompt_parameters(topics=topics)
# compute desired summary length and maximum of input characters
summary_length_parameters = summary_pipeline.process_desired_summary_length(
desired_summary_length=desired_summary_length
)
messages.extend(summary_length_parameters["messages"])
max_input_chars = summary_pipeline.calculate_max_input_chars(
desired_summary_chars=summary_length_parameters["desired_summary_chars"],
remaining_context_length=remaining_context_length,
)
summary_parameters = SummaryParameters(
messages=messages,
desired_summary_chars=summary_length_parameters["desired_summary_chars"],
max_input_chars=max_input_chars,
prompt_parameters=prompt_parameters,
)
logger.debug(
f"All parameters for the summarization have been computed: '{summary_parameters}'"
)
return summary_parameters
def estimate_max_input_chars(
self,
language_model: str,
) -> dict[str, int]:
"""Calculating the maximal number of characters of text input which should be summarized in one LLM call.
The calculation is based on the context length of the currently used LLM (stated in the llms.yml).
The context length needs to encompass the length of the system-prompt, the length of the text input and
the length of the summary as output.
An estimate of 30% for text input and summary output is used.
Furthermore the context window should not be maxed out in order to ensure high quality summaries (therefore we
only use 70% of the theoretically remaining max input length).
Args:
language_model (str): Name of the LLM as stated in llms.yml
Returns:
dict: containing a estimation of the maximal number of input characters for the current summary set up
(which will be used to compute the ideal chunksize) and the remaining context length after subtracting
all prompt lengths and a 30% buffer (which will be used to refine the maximal number of input characters
for further usage in the reduce loop of the summarization)
"""
# get summary pipeline for calculations
summary_pipeline = None
for pipeline in self.summary_pipelines:
if pipeline.llm_name == language_model:
summary_pipeline = pipeline
break
if not summary_pipeline:
logger.error(
"Parameters can not be computed because there is no summary pipeline initialized, "
f"which uses {language_model}."
)
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail=(
f"Es wurde ein ungültiges Sprachmodell ausgewählt: {language_model}."
" Bitte versuchen Sie es mit einem anderen Sprachmodell."
),
)
max_prompt_length = max(
(len(prompt.main) + len(prompt.focus_instructions))
for prompt in [
summary_pipeline.llm.prompt_config.system.map,
summary_pipeline.llm.prompt_config.system.reduce,
summary_pipeline.llm.prompt_config.system.final,
]
)
context_length_chars = (
summary_pipeline.llm.context_length * 4
) # 1 token ~ 4 chars
remaining_context_length = int(
(context_length_chars - max_prompt_length)
* 0.7 # cap at 70% context-length
)
max_input_chars = int(remaining_context_length * 0.7)
logger.debug(
f"The max_input_chars are set to {max_input_chars} = 70% of remaining_context_length "
f"{remaining_context_length} with remaining_context_length = 70% of (LLMs maximal number of "
f"input chars - the longest prompt length (map/reduce/final prompt))."
)
return {
"max_input_chars_estimate": max_input_chars,
"remaining_context_length": remaining_context_length,
}
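To make the two 70% factors concrete, here is the arithmetic of estimate_max_input_chars traced with made-up numbers (the context length and prompt length below are placeholders, not values from llms.yml):

```python
# made-up numbers purely to trace the arithmetic of estimate_max_input_chars
context_length_tokens = 8_000   # hypothetical context window from llms.yml
max_prompt_length = 2_000       # longest of the map/reduce/final system prompts, in chars

context_length_chars = context_length_tokens * 4  # 32_000 chars (1 token ~ 4 chars)
remaining_context_length = int((context_length_chars - max_prompt_length) * 0.7)  # 21_000
max_input_chars_estimate = int(remaining_context_length * 0.7)  # 14_700

print(remaining_context_length, max_input_chars_estimate)  # 21000 14700
```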
configure_summary_parameters
async
configure_summary_parameters(warning_msg, language_model, remaining_context_length, desired_summary_length, topics)
Computes parameters based on parsing output and user input to finish initialization of the summary pipeline.
Computes the chunk-size minimum and maximum according to the specifications of the LLM and the input text length. Calculates the maximal input length (max_input_chars) which should be summarized in one LLM call, otherwise the summary needs to perform another recursion of its reduce part. The desired summary length in chars is needed as reduce-loop criterion and to influence the length of the LLM response to the map and reduce prompts.
| PARAMETER | DESCRIPTION |
|---|---|
| warning_msg | List of messages to the user, which regard the summary output. TYPE: list[str] |
| language_model | Name of the LLM as stated in llms.yml. TYPE: str |
| desired_summary_length | Containing the information needed to compute the desired summary length from the two keys output_length (float) and input_length (int). TYPE: dict |
| remaining_context_length | Remaining context length after subtracting all prompt lengths and a 30% buffer during max_input_chars estimation. TYPE: int |
| topics | User input text stating the topics the summary should focus on. TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| SummaryParameters | Parameters needed for summarization, which are computed according to the user's input. |

| RAISES | DESCRIPTION |
|---|---|
| HTTPException | HTTP_400_BAD_REQUEST raised if the selected language model was invalid. |
Source code in docs/microservices/summary/src/summarizing/summary_registry.py
async def configure_summary_parameters(
self,
warning_msg: list[str],
language_model: str,
remaining_context_length: int,
desired_summary_length: dict,
topics: str | None,
) -> SummaryParameters:
"""Computes parameter based on parsing output and user input to finish initialization of the summary pipeline.
Computes chuncksize minimum and maximum according to the specifications of the LLM and the input text length.
Calculating the maximal input length (max_input_chars) which should be summarized in one LLM call,
otherwise the summary needs to perform another recursion of its reduce part.
The desired summary length in chars is needed as reduce-loop criterion and to influence the length of the LLM
response to the map and reduce prompt.
Args:
warning_msg (list[str]): List of messages to the user, which regard the summary output.
language_model (str): Name of the LLM as stated in llms.yml
desired_summary_length (dict): Containing the information needed to compute the desired summary
length from the two keys output_length (float) and input_length (int).
remaining_context_length (int): Remaining context length after subtracting all prompt lengths and a 30%
buffer during max_input_chars estimation.
topics (str | None): User input text stating the topics the summary should focus on.
Returns:
SummaryParameters: Parameters needed for summarization, which are computed according to the user's input.
Raises:
HTTPException: HTTP_400_BAD_REQUEST raised if the selected language model was invalid.
"""
summary_pipeline = None
messages = warning_msg
# get summary pipeline for calculations
for pipeline in self.summary_pipelines:
if pipeline.llm_name == language_model:
summary_pipeline = pipeline
break
if not summary_pipeline:
logger.error(
"Parameters can not be computed because there is no summary pipeline initialized, "
f"which uses {language_model}."
)
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail=(
f"Es wurde ein ungültiges Sprachmodell ausgewählt: {language_model}."
" Bitte versuchen Sie es mit einem anderen Sprachmodell."
),
)
# prepare topics and set them as focus for summarization
if topics:
topics = await summary_pipeline.process_topic_input(topics=topics)
prompt_parameters = summary_pipeline.configure_prompt_parameters(topics=topics)
# compute desired summary length and maximum of input characters
summary_length_parameters = summary_pipeline.process_desired_summary_length(
desired_summary_length=desired_summary_length
)
messages.extend(summary_length_parameters["messages"])
max_input_chars = summary_pipeline.calculate_max_input_chars(
desired_summary_chars=summary_length_parameters["desired_summary_chars"],
remaining_context_length=remaining_context_length,
)
summary_parameters = SummaryParameters(
messages=messages,
desired_summary_chars=summary_length_parameters["desired_summary_chars"],
max_input_chars=max_input_chars,
prompt_parameters=prompt_parameters,
)
logger.debug(
f"All parameters for the summarization have been computed: '{summary_parameters}'"
)
return summary_parameters
estimate_max_input_chars
Calculating the maximal number of characters of text input which should be summarized in one LLM call.
The calculation is based on the context length of the currently used LLM (stated in the llms.yml). The context length needs to encompass the length of the system-prompt, the length of the text input and the length of the summary as output.
An estimate of 30% for text input and summary output is used.
Furthermore the context window should not be maxed out in order to ensure high quality summaries (therefore we only use 70% of the theoretically remaining max input length).

| PARAMETER | DESCRIPTION |
|---|---|
| language_model | Name of the LLM as stated in llms.yml. TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| dict | Containing an estimation of the maximal number of input characters for the current summary setup (which will be used to compute the ideal chunksize) and the remaining context length after subtracting all prompt lengths and a 30% buffer (which will be used to refine the maximal number of input characters for further usage in the reduce loop of the summarization). |
Source code in docs/microservices/summary/src/summarizing/summary_registry.py
def estimate_max_input_chars(
self,
language_model: str,
) -> dict[str, int]:
"""Calculating the maximal number of characters of text input which should be summarized in one LLM call.
The calculation is based on the context length of the currently used LLM (stated in the llms.yml).
The context length needs to encompass the length of the system-prompt, the length of the text input and
the length of the summary as output.
An estimate of 30% for text input and summary output is used.
Furthermore the context window should not be maxed out in order to ensure high quality summaries (therefore we
only use 70% of the theoretically remaining max input length).
Args:
language_model (str): Name of the LLM as stated in llms.yml
Returns:
dict: containing a estimation of the maximal number of input characters for the current summary set up
(which will be used to compute the ideal chunksize) and the remaining context length after subtracting
all prompt lengths and a 30% buffer (which will be used to refine the maximal number of input characters
for further usage in the reduce loop of the summarization)
"""
# get summary pipeline for calculations
summary_pipeline = None
for pipeline in self.summary_pipelines:
if pipeline.llm_name == language_model:
summary_pipeline = pipeline
break
if not summary_pipeline:
logger.error(
"Parameters can not be computed because there is no summary pipeline initialized, "
f"which uses {language_model}."
)
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail=(
f"Es wurde ein ungültiges Sprachmodell ausgewählt: {language_model}."
" Bitte versuchen Sie es mit einem anderen Sprachmodell."
),
)
max_prompt_length = max(
(len(prompt.main) + len(prompt.focus_instructions))
for prompt in [
summary_pipeline.llm.prompt_config.system.map,
summary_pipeline.llm.prompt_config.system.reduce,
summary_pipeline.llm.prompt_config.system.final,
]
)
context_length_chars = (
summary_pipeline.llm.context_length * 4
) # 1 token ~ 4 chars
remaining_context_length = int(
(context_length_chars - max_prompt_length)
* 0.7 # cap at 70% context-length
)
max_input_chars = int(remaining_context_length * 0.7)
logger.debug(
f"The max_input_chars are set to {max_input_chars} = 70% of remaining_context_length "
f"{remaining_context_length} with remaining_context_length = 70% of (LLMs maximal number of "
f"input chars - the longest prompt length (map/reduce/final prompt))."
)
return {
"max_input_chars_estimate": max_input_chars,
"remaining_context_length": remaining_context_length,
}
request_summary
async
Start the summarization of chunks according to the given parameters.
| PARAMETER | DESCRIPTION |
|---|---|
| summarize_input | Containing chunks incl. metadata, messages, prompt_parameters and LLM. TYPE: SummarizeInput |
| text | Parsed input file or text. TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| SummaryAPIOutput | Contains the summary, a message to the user and the parsing output. |

| RAISES | DESCRIPTION |
|---|---|
| HTTPException | HTTP_424_FAILED_DEPENDENCY raised if the text could not be summarized due to missing chunks. HTTP_400_BAD_REQUEST raised if the selected language model was invalid. |
Source code in docs/microservices/summary/src/summarizing/summary_registry.py
async def request_summary(
self,
summarize_input: SummarizeInput,
text: str,
) -> SummaryAPIOutput:
"""Start the summarization of chunks according to the given parameters.
Args:
summarize_input (SummarizeInput): Containing chunks incl. metadata, messages, prompt_parameters and LLM.
text (str): Parsed input file or text.
Returns:
Contains the summary, a message to the user and the parsing output.
Raises:
HTTPException: HTTP_424_FAILED_DEPENDENCY raised if the text could not be summarized due to missing chunks.
HTTP_400_BAD_REQUEST raised if the selected language model was invalid.
"""
# errorhandling for missing chunks
if not summarize_input.chunks:
logger.error(
"No summarization possible because there are no chunks to summarize."
)
raise HTTPException(
status_code=status.HTTP_424_FAILED_DEPENDENCY,
detail=(
"Der Text konnte nicht verarbeitet werden. Bitte versuchen Sie es mit einer anderen Datei."
),
)
# select and start pipeline
summary_pipeline = None
summary_result = {}
for pipeline in self.summary_pipelines:
if pipeline.llm_name == summarize_input.language_model:
summary_pipeline = pipeline
summary_result = await summary_pipeline.summarize(
summarize_input=summarize_input,
)
break
# errorhandling for missing pipeline
if not summary_pipeline:
logger.error(
f"Warning: Invalid 'language_model' requested: {summarize_input.language_model}."
)
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail=(
f"Mit dem ausgewählten Sprachmodell {summarize_input.language_model} konnte keine Zusammenfassung "
"generiert werden. Bitte versuchen Sie es mit einem anderen Sprachmodell."
),
)
return SummaryAPIOutput(
summary=summary_result["summary"],
warning_msg=" ".join(set(summary_result["messages"])),
parsed_text=text,
)
utils
Utils functions for logging, LLM availability check, LLM authentication and configuration processing.
| MODULE | DESCRIPTION |
|---|---|
| base_logger | Set up the root logger for the entire application. This logger will log messages to the console and a file. |
| check_model_api_availability | This module provides functions to check LLM-APIs for availability. |
| openai_custom_auth | Customized Httpx Authentication Client. |
| process_configs | Methods to load configs and start checks of config integrity. |
base_logger
Set up the root logger for the entire application. This logger will log messages to the console and a file.
| FUNCTION | DESCRIPTION |
|---|---|
| setup_logger | Initializes the logger with the desired log level and adds handlers. |
setup_logger
Initializes the logger with the desired log level and adds handlers.
Sets up the root logger, which all other loggers inherit from. Adds file, console and exit handlers to the logger and sets the format.
Source code in docs/microservices/summary/src/utils/base_logger.py
def setup_logger() -> None:
"""Initializes the logger with the desired log level and add handlers.
Sets up the root logger, which all other loggers inherit from.
Adds file, console and exit handlers to the logger and sets the format.
"""
logger = logging.getLogger()
# create different handlers for log file and console
file_handler = logging.handlers.RotatingFileHandler(
filename=settings.log_file,
maxBytes=settings.log_file_max_bytes,
backupCount=settings.log_file_backup_count,
)
console_handler = logging.StreamHandler()
# define log format and set for each handler
formatter = logging.Formatter(
fmt="%(asctime)s - %(levelname)8s - %(module)s - %(funcName)s: %(message)s",
datefmt="%Y-%m-%d %H:%M:%S%z",
)
file_handler.setFormatter(formatter)
console_handler.setFormatter(formatter)
# add handlers to the logger
logger.addHandler(file_handler)
logger.addHandler(console_handler)
logger.setLevel(settings.log_level)
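Since only the root logger is configured, every module simply calls logging.getLogger(__name__) and inherits the handlers and format. A minimal, settings-free sketch of that pattern (log level and handler choice are illustrative):

```python
import logging

# configure the root logger once at startup (stand-in for setup_logger without the settings object)
root = logging.getLogger()
console_handler = logging.StreamHandler()
console_handler.setFormatter(
    logging.Formatter(
        fmt="%(asctime)s - %(levelname)8s - %(module)s - %(funcName)s: %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S%z",
    )
)
root.addHandler(console_handler)
root.setLevel(logging.INFO)

# any module logger now inherits the root configuration
logger = logging.getLogger(__name__)
logger.info("Summary service logging is set up.")
```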
check_model_api_availability
This module provides functions to check LLM-APIs for availability.
To check a certain LLM use await is_model_api_available(llm.api, llm_name).
To get all LLMs that are activated in configs/general.yml, use await get_available_llms().
| FUNCTION | DESCRIPTION |
|---|---|
| get_available_llms | Returns a list of available LLMs. |
| is_model_api_available | Checks if API is available using credentials. |
get_available_llms
async
Returns a list of available LLMs.
| RETURNS | DESCRIPTION |
|---|---|
| list[dict[str, Any]] | List of available LLMs with selected infos. |
Source code in docs/microservices/summary/src/utils/check_model_api_availability.py
async def get_available_llms() -> list[dict[str, Any]]:
"""Returns a list of available LLMs.
Returns:
List of available LLMs with selected infos
"""
available_llms = []
# iterate over model_groups (services), i.e. chat, RAG, embedding, ...
for model_group_key in llm_config:
logger.debug(f"Checking APIs for {model_group_key}-LLMs.")
model_group = llm_config[model_group_key]
for llm_name, llm in model_group.items():
logger.debug(f"Checking availability of {llm_name}")
if await is_model_api_available(llm.api, llm_name):
llm_dict = llm.model_dump(include=["label", "is_remote"])
llm_dict["name"] = llm_name
available_llms.append(llm_dict)
return available_llms
is_model_api_available
async
Checks if API is available using credentials.
Availability is checked by sending a GET request and, if that does not return 200, a POST request. If a health_check endpoint is provided, the request is sent to that endpoint; otherwise, it is sent to the main API URL.
| PARAMETER | DESCRIPTION |
|---|---|
| llm_api | The LLMAPI instance to check. TYPE: LLMAPI |
| llm_name | ID of the LLM as used in the config file as reference. TYPE: str |
| timeout_in_s | Http timeout in seconds; defaults to 10. TYPE: int |

| RETURNS | DESCRIPTION |
|---|---|
| bool | Whether the model API is available or not. |
Source code in docs/microservices/summary/src/utils/check_model_api_availability.py
async def is_model_api_available(
llm_api: LLMAPI,
llm_name: str,
timeout_in_s: int = 10,
) -> bool:
"""Checks if API is available using credentials.
Availability is checked by sending a GET request and, if that does not return 200, a POST request.
If a health_check endpoint is provided, the request is sent to that endpoint; otherwise, it is sent to the main API URL.
Args:
llm_api (LLMAPI): The LLMAPI instance to check.
llm_name (str): ID of the LLM as used in the config file as reference.
timeout_in_s (int): Http timeout in seconds; defaults to 10.
Returns:
Whether the model API is available or not.
"""
headers = {"Content-type": "application/json"}
# Authorization is not always needed
if llm_api.auth:
headers["Authorization"] = llm_api.auth.get_auth_header()
url = llm_api.get_health_check_url()
# test health check endpoint with GET, fall back to POST
try:
async with httpx.AsyncClient() as client:
response = await client.get(
url,
headers=headers,
timeout=timeout_in_s,
)
logger.debug(
f"{url} health check via GET request: {response.status_code=}, LLM: '{llm_name}"
)
# test with POST
if response.status_code != HTTPStatus.OK:
async with httpx.AsyncClient() as client:
response = await client.post(
url,
headers=headers,
timeout=timeout_in_s,
)
logger.debug(
f"{url} health check via POST request: {response.status_code=}, LLM: '{llm_name}"
)
except Exception as e:
logger.warning(
f"Exception when trying to reach LLM API. Error: {e}, LLM: '{llm_name}"
)
return False
if response.status_code != HTTPStatus.OK:
logger.warning(
f"LLM unavailable: Could not establish connection to LLM-API. LLM: '{llm_name}"
)
return response.status_code == HTTPStatus.OK
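The GET-then-POST fallback can be exercised in isolation with httpx's MockTransport. The sketch below simulates a health endpoint that rejects GET but accepts POST; the URL and handler are illustrative, not a real LLM API:

```python
import asyncio
from http import HTTPStatus

import httpx


def handler(request: httpx.Request) -> httpx.Response:
    # simulated health endpoint: GET is rejected, POST succeeds
    return httpx.Response(200 if request.method == "POST" else 405)


async def check(url: str) -> bool:
    transport = httpx.MockTransport(handler)
    async with httpx.AsyncClient(transport=transport) as client:
        response = await client.get(url, timeout=10)
        if response.status_code != HTTPStatus.OK:
            response = await client.post(url, timeout=10)
    return response.status_code == HTTPStatus.OK


print(asyncio.run(check("https://llm.example/health")))  # -> True
```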
openai_custom_auth
Customized Httpx Authentication Client.
| CLASS | DESCRIPTION |
|---|---|
| CustomAuthClient | Custom HTTP transport for OpenAI client. |
CustomAuthClient
Bases: AsyncClient
Custom HTTP transport for OpenAI client.
This class supports both Bearer Token Authentication and Basic Authentication.
If auth_type is 'token', the secret is expected to be the API key.
If auth_type is 'basic_auth', the secret is expected to be a 'username:password' string, which the client base64-encodes for the Authorization header.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| auth_header | Authentication header for the httpx client. TYPE: str |

| METHOD | DESCRIPTION |
|---|---|
| a_send | Asynchronous method for sending HTTP requests. |
| send | Synchronous method for sending HTTP requests. |
Source code in docs/microservices/summary/src/utils/openai_custom_auth.py
class CustomAuthClient(httpx.AsyncClient):
"""Custom HTTP transport for OpenAI client.
This class supports both Bearer Token Authentication and Basic Authentication.
If `auth_type` is 'token', the `secret` is expected to be the API key.
If `auth_type` is 'basic_auth', the `secret` is expected to be a 'username:password' string, which is base64-encoded by the client.
Attributes:
auth_header (str): Authentication header for the httpx client.
Methods:
a_send(request, *args, **kwargs): Asynchronous method for sending HTTP requests.
send(request, *args, **kwargs): Synchronous method for sending HTTP requests.
"""
def __init__(
self,
secret: str | None = None,
auth_type: Literal["token", "basic_auth"] | None = None,
*args: object,
**kwargs: object,
) -> None:
"""Initializes the custom HTTP transport for OpenAI client.
Initialization expects authentication per 'token' (here the `secret` is expected to be the OpenAI API key)
or 'basic_auth' (here the `secret` is expected to be a 'username:password' string, which is base64-encoded by the client).
Args:
secret (str | None): OpenAI API Key or Basic Auth credentials (username:password).
This is required depending on the `auth_type`. If `auth_type` is 'token', the `secret` should be
the API key. If `auth_type` is 'basic_auth', the `secret` should be a 'username:password' string,
which is base64-encoded by the client.
auth_type (Literal | None): The type of authentication to use. It can be 'token' or 'basic_auth'.
*args: Variable length argument list.
**kwargs: Arbitrary keyword arguments.
Raises:
ValueError: If `auth_type` is provided but `secret` is not provided.
"""
super().__init__(*args, **kwargs)
self.auth_header = ""
if auth_type and not secret:
raise ValueError("API credentials are required but missing.")
if auth_type == "token":
self.auth_header = f"Bearer {secret}"
elif auth_type == "basic_auth":
encoded_credentials = base64.b64encode(secret.encode()).decode()
self.auth_header = f"Basic {encoded_credentials}"
async def a_send(
self,
request: httpx.Request,
*args: object,
**kwargs: object,
) -> httpx.Response:
"""Asynchronous version of the send method to handle requests asynchronously.
Args:
request (httpx.Request): Request to send asynchronously.
*args: Variable length argument list.
**kwargs: Arbitrary keyword arguments.
Returns:
Response to the request.
"""
if "Authorization" in request.headers:
del request.headers["Authorization"]
if self.auth_header:
request.headers["Authorization"] = self.auth_header
return await super().a_send(request, *args, **kwargs)
def send(
self,
request: httpx.Request,
*args: object,
**kwargs: object,
) -> httpx.Response:
"""Version of the send method to handle requests synchronously.
Args:
request (httpx.Request): Request to send synchronously.
*args: Variable length argument list.
**kwargs: Arbitrary keyword arguments.
Returns:
Response to the request.
"""
if "Authorization" in request.headers:
del request.headers["Authorization"]
if self.auth_header:
request.headers["Authorization"] = self.auth_header
return super().send(request, *args, **kwargs)
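The intended use is to hand this client to the OpenAI SDK as its HTTP transport, so the SDK's own Authorization header is replaced by the custom one. A sketch under the assumption that the class is importable as shown and that the openai package (v1+) is installed; base URL and credentials are placeholders:

```python
from openai import AsyncOpenAI

from src.utils.openai_custom_auth import CustomAuthClient  # assumed import path

# token auth: the resulting header is "Bearer <secret>"
http_client = CustomAuthClient(secret="my-llm-api-key", auth_type="token")

openai_client = AsyncOpenAI(
    api_key="unused-placeholder",       # the effective header is set by CustomAuthClient
    base_url="https://llm.example/v1",  # placeholder endpoint
    http_client=http_client,
)
```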
a_send
async
Asynchronous version of the send method to handle requests asynchronously.
| PARAMETER | DESCRIPTION |
|---|---|
| request | Request to send asynchronously. TYPE: httpx.Request |
| `*args` | Variable length argument list. TYPE: object |
| `**kwargs` | Arbitrary keyword arguments. TYPE: object |

| RETURNS | DESCRIPTION |
|---|---|
| Response | Response to the request. |
Source code in docs/microservices/summary/src/utils/openai_custom_auth.py
async def a_send(
self,
request: httpx.Request,
*args: object,
**kwargs: object,
) -> httpx.Response:
"""Asynchronous version of the send method to handle requests asynchronously.
Args:
request (httpx.Request): Request to send asynchronously.
*args: Variable length argument list.
**kwargs: Arbitrary keyword arguments.
Returns:
Response to the request.
"""
if "Authorization" in request.headers:
del request.headers["Authorization"]
if self.auth_header:
request.headers["Authorization"] = self.auth_header
return await super().a_send(request, *args, **kwargs)
send
Version of the send method to handle requests synchronously.
| PARAMETER | DESCRIPTION |
|---|---|
| request | Request to send synchronously. TYPE: httpx.Request |
| `*args` | Variable length argument list. TYPE: object |
| `**kwargs` | Arbitrary keyword arguments. TYPE: object |

| RETURNS | DESCRIPTION |
|---|---|
| Response | Response to the request. |
Source code in docs/microservices/summary/src/utils/openai_custom_auth.py
def send(
self,
request: httpx.Request,
*args: object,
**kwargs: object,
) -> httpx.Response:
"""Version of the send method to handle requests synchronously.
Args:
request (httpx.Request): Request to send synchronously.
*args: Variable length argument list.
**kwargs: Arbitrary keyword arguments.
Returns:
Response to the request.
"""
if "Authorization" in request.headers:
del request.headers["Authorization"]
if self.auth_header:
request.headers["Authorization"] = self.auth_header
return super().send(request, *args, **kwargs)
process_configs
Methods to load configs and start checks of config integrity.
| FUNCTION | DESCRIPTION |
|---|---|
| load_all_configs | Loads config settings from respective paths. |
| load_from_yml_in_pydantic_model | Loads config from 'list_of_yaml_paths' into given pydantic-Model. |
| load_yaml | Loads yaml files. |
| merge_specific_cfgs_in_place | Copies prompt-config to appropriate section in general llm_config. |
| postprocess_configs | Post-Processes loaded configs. |
| remove_unavailable_models | Removes models from all usecases, if they are not in 'active_models'. |
load_all_configs
Loads config settings from respective paths.
| PARAMETER | DESCRIPTION |
|---|---|
| general_config_paths | Path to config, matching 'Settings'. TYPE: Path |
| path_to_llm_prompts | Path to config, matching 'LLMPromptMaps'. TYPE: Path |
| path_to_llm_model_configs | Path to config, matching 'LLMConfig'. TYPE: Path |

| RETURNS | DESCRIPTION |
|---|---|
| tuple[Settings, LLMConfig] | Configs loaded into their Pydantic models. |
Source code in docs/microservices/summary/src/utils/process_configs.py
def load_all_configs(
general_config_paths: Path,
path_to_llm_prompts: Path,
path_to_llm_model_configs: Path,
) -> tuple[Settings, LLMConfig]:
"""Loads config settings from respective paths.
Args:
general_config_paths (Path): Path to config, matching 'Settings'.
path_to_llm_prompts (Path): Path to config, matching 'LLMPromptMaps'.
path_to_llm_model_configs (Path): Path to config, matching 'LLMConfig'.
Returns:
Config loaded into their Pydantic Model.
"""
settings = load_from_yml_in_pydantic_model(general_config_paths, Settings)
llm_prompts = load_from_yml_in_pydantic_model(path_to_llm_prompts, LLMPromptMaps)
llm_config = load_from_yml_in_pydantic_model(path_to_llm_model_configs, LLMConfig)
postprocess_configs(settings, llm_prompts, llm_config)
return settings, llm_config
load_from_yml_in_pydantic_model
Loads config from 'list_of_yaml_paths' into given pydantic-Model.
| PARAMETER | DESCRIPTION |
|---|---|
| yaml_path | Yaml to load. TYPE: Path |
| pydantic_reference_model | Pydantic model to load yaml into. TYPE: BaseModel |

| RETURNS | DESCRIPTION |
|---|---|
| BaseModel | BaseModel derived Pydantic data class. |

| RAISES | DESCRIPTION |
|---|---|
| HTTPException | HTTP_500_INTERNAL_SERVER_ERROR raised in case of an invalid configuration from the yaml file (the underlying ValidationError is caught and logged). |
Source code in docs/microservices/summary/src/utils/process_configs.py
def load_from_yml_in_pydantic_model(
yaml_path: Path, pydantic_reference_model: BaseModel
) -> BaseModel:
"""Loads config from 'list_of_yaml_paths' into given pydantic-Model.
Args:
yaml_path (Path): Yaml to load.
pydantic_reference_model (BaseModel): Pydantic model to load yaml into.
Returns:
BaseModel derived Pydantic data class.
Raises:
HTTPException: HTTP_500_INTERNAL_SERVER_ERROR raised in case of an invalid configuration from the yaml file.
"""
data = load_yaml(yaml_path)
try:
pydantic_class = pydantic_reference_model(**data)
logger.info(f"Config loaded from: '{yaml_path}'")
return pydantic_class
except ValidationError as e:
logger.critical(f"Error loading config: '{e}'")
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail=(
"Die Zusammenfassungsfunktion steht aufgrund eines technischen Fehlers derzeit nicht zur "
"Verfügung. "
),
)
load_yaml
Loads yaml files.
| PARAMETER | DESCRIPTION |
|---|---|
| yaml_path | Path to yaml. TYPE: Path |

| RETURNS | DESCRIPTION |
|---|---|
| dict[str, Any] | Content of loaded yaml. |

| RAISES | DESCRIPTION |
|---|---|
| FileNotFoundError | Raised in case of an invalid path to the yaml file. |
Source code in docs/microservices/summary/src/utils/process_configs.py
def load_yaml(yaml_path: Path) -> dict[str, Any]:
"""Loads yaml files.
Args:
yaml_path (Path): Path to yaml.
Returns:
Content of loaded yaml.
Raises:
FileNotFoundError: Raised in case of an invalid path to the yaml file.
"""
if not yaml_path.exists():
logger.error(f"Invalid path: '{yaml_path}'")
raise FileNotFoundError
with open(yaml_path) as file:
return yaml.safe_load(file)
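The same load-and-validate pattern, reduced to a self-contained example with a throwaway model and a temporary YAML file (the model and its fields are made up for illustration, not the service's Settings):

```python
import tempfile
from pathlib import Path

import yaml
from pydantic import BaseModel


class DemoSettings(BaseModel):
    log_level: str
    check_llm_api_interval_in_s: int


with tempfile.TemporaryDirectory() as tmp:
    yaml_path = Path(tmp) / "general.yml"
    yaml_path.write_text("log_level: INFO\ncheck_llm_api_interval_in_s: 300\n")

    data = yaml.safe_load(yaml_path.read_text())  # what load_yaml does
    settings = DemoSettings(**data)               # what load_from_yml_in_pydantic_model does
    print(settings)
```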
merge_specific_cfgs_in_place
Copies prompt-config to appropriate section in general llm_config.
A prompt config is only merged if the 'prompt_map' referenced in LLMConfig can be found in LLMPromptMaps, i.e. this generalizes something like:
cfg["phi3:mini"].prompt_config = prompt[cfg["phi3:mini"].prompt_map]
| PARAMETER | DESCRIPTION |
|---|---|
| llm_config | Target for merge of Prompt parameter. TYPE: LLMConfig |
| llm_prompts | Source to merge Prompt parameter from. TYPE: LLMPromptMaps |

| RETURNS | DESCRIPTION |
|---|---|
| bool | True if no problems occurred. |
Source code in docs/microservices/summary/src/utils/process_configs.py
def merge_specific_cfgs_in_place(
llm_config: LLMConfig, llm_prompts: LLMPromptMaps
) -> bool:
"""Copies prompt-config to appropriate section in general llm_config.
Only if 'prompt_map' in LLMConfig can be found in LLMPromptMaps, it will be merged.
i.e. try to generalize sth. like this:
cfg["phi3:mini"].prompt_config = prompt[cfg["phi3:mini"].prompt_map]
Args:
llm_config (LLMConfig): Target for merge of Prompt parameter.
llm_prompts (LLMPromptMaps): Source to merge Prompt parameter from.
Returns:
True if no problems occurred.
"""
no_issues_occurred = True
for usecase in llm_config:
# load identical usecases, i.e. chat, RAG
try:
cfg = getattr(llm_config, usecase)
prompt = getattr(llm_prompts, usecase)
except AttributeError:
logger.warning(
f"Usecase '{usecase}' not matching between prompt- and general llm config. \
Skipping cfg-merge for '{usecase}' .."
)
no_issues_occurred = False
continue
# copy prompt config to its usecase- and model-counterpart
for model in cfg:
prompt_map_to_use = cfg[model].prompt_map
if prompt_map_to_use in prompt:
cfg[model].prompt_config = prompt[prompt_map_to_use]
else:
logger.warning(
f"'prompt_map: {prompt_map_to_use}' from LLM-config not in prompt-config for '{usecase}'. \
Skipping .."
)
no_issues_occurred = False
continue
return no_issues_occurred
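The effect of the merge, reduced to plain dictionaries (model and prompt-map names are illustrative; the real function operates on the LLMConfig and LLMPromptMaps pydantic models):

```python
# plain-dict stand-ins for LLMConfig and LLMPromptMaps
llm_config = {"summary": {"phi3:mini": {"prompt_map": "default_de", "prompt_config": None}}}
llm_prompts = {"summary": {"default_de": {"system": {"map": "...", "reduce": "...", "final": "..."}}}}

for usecase, cfg in llm_config.items():
    prompts = llm_prompts.get(usecase, {})
    for model_cfg in cfg.values():
        prompt_map = model_cfg["prompt_map"]
        if prompt_map in prompts:
            # cfg["phi3:mini"].prompt_config = prompt[cfg["phi3:mini"].prompt_map]
            model_cfg["prompt_config"] = prompts[prompt_map]

print(llm_config["summary"]["phi3:mini"]["prompt_config"])
```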
postprocess_configs
Post-Processes loaded configs.
Remove unused models (from settings.active_models), merge LLMPromptMaps into LLMConfig.
| PARAMETER | DESCRIPTION |
|---|---|
| settings | Config matching pydantic 'Settings'. TYPE: Settings |
| llm_prompts | Config matching pydantic 'LLMPromptMaps'. TYPE: LLMPromptMaps |
| llm_config | Config matching pydantic 'LLMConfig'. TYPE: LLMConfig |

| RETURNS | DESCRIPTION |
|---|---|
| LLMConfig | Merged and filtered LLM configuration. |
Source code in docs/microservices/summary/src/utils/process_configs.py
def postprocess_configs(
settings: Settings, llm_prompts: LLMPromptMaps, llm_config: LLMConfig
) -> LLMConfig:
"""Post-Processes loaded configs.
Remove unused models (from settings.active_models), merge LLMPromptMaps into LLMConfig.
Args:
settings (Settings): Config matching pydantic 'Settings'.
llm_prompts (LLMPromptMaps): Config matching pydantic 'LLMPromptMaps'.
llm_config (LLMConfig): Config matching pydantic 'LLMConfig'.
Returns:
Merged and filtered LLM configuration.
"""
remove_unavailable_models(llm_config, settings.active_llms)
merge_specific_cfgs_in_place(llm_config, llm_prompts)
return llm_config
remove_unavailable_models
Removes models from all usecases, if they are not in 'active_models'.
| PARAMETER | DESCRIPTION |
|---|---|
| input_config | Config to change. TYPE: LLMConfig |
| active_models | Models to keep; all others are removed. TYPE: list[str] |

| RETURNS | DESCRIPTION |
|---|---|
| None | None |
Source code in docs/microservices/summary/src/utils/process_configs.py
def remove_unavailable_models(
input_config: LLMConfig, active_models: list[str]
) -> None:
"""Removes models from all useacases, if they are not in 'active_models'.
Args:
input_config (LLMConfig): Config to change.
active_models (list[str]): Models to keep - remove other.
Returns:
None
"""
for usecase in input_config:
cfg = getattr(input_config, usecase)
available_models_for_usecase = getattr(active_models, usecase)
for model in list(cfg):
if model not in available_models_for_usecase:
cfg.pop(model)