When I run some code directly, or under 2.0b7, it does 30+ iterations per second and completes in a few seconds. Yet on 2.0b10 it is massively slower, at around 8 seconds per iteration. I posted a similar issue on 2.0b7 when using ConcurrentTaskRunner. Switching to SequentialTaskRunner worked fine then, but on the new release this is also really slow.
Example code is below. I will try to create an example that does not include spacy. However, I wonder whether there is a simple explanation and I just need to tweak my code or settings for the new release:
import logging
import os

os.environ.update(
    PREFECT_API_URL="http://127.0.0.1:4200/api",
)

log = logging.getLogger(__name__)

from prefect import flow, task
from prefect.task_runners import SequentialTaskRunner, ConcurrentTaskRunner
from tqdm.auto import tqdm
import spacy


@task
def task1():
    log.warning("running the task")
    nlp = spacy.load("en_core_web_sm")
    log.warning("loaded data")
    out = [nlp(x) for x in tqdm(["the cat sat on the mat." * 20] * 100)]


@flow(task_runner=SequentialTaskRunner())
def flow1():
    log.warning("running the flow ******** ")
    task1()


if __name__ == "__main__":
    task1.fn()
    log.warning("completed raw function")
    flow1()
It would be useful, if possible, to detect this issue and raise an error rather than just hanging. If not, then please put a note in the docs. It is not obvious what restrictions are necessary, e.g. around threads, processes, async functions…
I am unsure why it worked fine in the previous version but not in the latest. Something changed, I guess.
I was going to say that an alternative solution might be to use prefect2 tags to force prefect2 to execute some tasks without multiprocessing. However, that may not work, because this issue still occurs with SequentialTaskRunner, even though that is presumably not using any multiprocessing?
This worked with spacy. However, when I run a pytorch model I have the same issue. It worked in the previous version of prefect2, but in the new version it is very slow, even with SequentialTaskRunner running just a single task.
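Since the slowdown shows up with both spacy and pytorch, a reproduction without either library might help narrow it down. Below is a minimal sketch of how the iterations-per-second figure could be measured with a plain CPU-bound loop. The names cpu_bound_work and iterations_per_second are hypothetical helpers made up for this illustration, and the loop is only an assumed stand-in for nlp(x), so it may not trigger the same behaviour.

```python
import time


def cpu_bound_work(n=200_000):
    """Hypothetical stand-in for nlp(x): a pure-Python CPU-bound loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def iterations_per_second(fn, iterations=20):
    """Call fn() repeatedly and return the observed rate in calls per second."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return iterations / max(elapsed, 1e-9)


if __name__ == "__main__":
    rate = iterations_per_second(cpu_bound_work)
    print(f"direct call: {rate:.1f} iterations/s")
```

Calling iterations_per_second(cpu_bound_work) once directly and once from inside the task body would give two numbers to compare across 2.0b7 and 2.0b10.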