Extending Renard

Creating new steps

Usually, steps must implement at least four methods: __init__, __call__, needs and production.

Here is an example of creating a basic tokenization step:

from typing import Dict, Any, Set
from renard.pipeline.core import PipelineStep

class BasicTokenizerStep(PipelineStep):

    def __init__(self):
        # let the base class set up its own state
        super().__init__()

    def __call__(self, text: str, **kwargs) -> Dict[str, Any]:
        return {"tokens": text.split(" ")}

    def needs(self) -> Set[str]:
        return {"text"}

    def production(self) -> Set[str]:
        return {"tokens"}
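To see how needs() and production() fit together, here is a minimal self-contained sketch of the mechanism: a toy pipeline runner that checks each step's declared inputs against the state produced so far before calling it. The PipelineStep base class and run_pipeline function below are stand-ins written for this illustration, not Renard's actual implementation.

```python
from typing import Any, Dict, List, Set

class PipelineStep:
    """Stand-in base class illustrating the step protocol."""

    def __call__(self, **kwargs) -> Dict[str, Any]:
        raise NotImplementedError

    def needs(self) -> Set[str]:
        raise NotImplementedError

    def production(self) -> Set[str]:
        raise NotImplementedError

class BasicTokenizerStep(PipelineStep):
    def __call__(self, text: str, **kwargs) -> Dict[str, Any]:
        return {"tokens": text.split(" ")}

    def needs(self) -> Set[str]:
        return {"text"}

    def production(self) -> Set[str]:
        return {"tokens"}

def run_pipeline(steps: List[PipelineStep], text: str) -> Dict[str, Any]:
    """Run steps in order, checking needs() against the available state."""
    state: Dict[str, Any] = {"text": text}
    for step in steps:
        missing = step.needs() - state.keys()
        if missing:
            raise ValueError(f"{type(step).__name__} is missing inputs: {missing}")
        # each step's output keys extend the shared state
        state.update(step(**state))
    return state

state = run_pipeline([BasicTokenizerStep()], "hello world")
print(state["tokens"])  # ['hello', 'world']
```

Declaring requirements this way lets a pipeline validate that every step's inputs will be satisfied before any processing starts, rather than failing midway through a run.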

Additionally, the following methods can be overridden: