Hi Alice,
thanks for bringing up this topic. Coincidentally, I've been (vaguely) planning to work on it as well. On the phone version of Sculpt, I'd like to give the user an easy way to uninstall software (including the implicitly installed dependencies).
> I want to create a new component, 'depot_autoremove', that would be executed by the 'depot_download_manager' after the 'extract' step as an optional stage. The 'depot_autoremove' component will consume a config, the installation config, to identify a set of packages to keep installed. It will remove any other packages and orphan dependencies not part of that set.
I think that the removal of depot content is not related to the 'depot_download_manager'. It can be considered as an independent problem. Since the removal of depot content does not involve any networking and does not process any (potentially dangerous) data (like extracting archives), we can get away with a simple component that merely scans the depot and removes content using plain file operations. So we don't need any fine-grained sandboxing as done by the depot-download subsystem.
Let me share my thoughts on what I would expect from a depot-uninstall component. In general, it would perform two steps.
1. It would remove a selection of depot packages according to its configuration. By depot package, I refer only to pkg/<version>/ directories.
2. It would garbage-collect all depot content that is no longer referenced by any pkg present in the depot. I guess this is what you had in mind with the 'depot_autoremove' naming.
The pkg-removal step raises the question of how to specify the set of packages to remove. You suggested specifying a list of pkgs to keep and discarding everything not featured in this list. In other situations, like when using Sculpt with a large disk and preserving the ability to roll the system back to any previous version, it would be more appropriate to explicitly select the packages to remove - letting the user make these decisions interactively.
I think the <config> of the depot-uninstall tool could accommodate both situations quite well. E.g.,
Remove one specific version of a pkg:
<config>
  <remove user="cnuke" pkg="pdf_view" version="2022-02-22"/>
  ...
</config>
Remove all versions of a pkg:
<config>
  <remove user="cnuke" pkg="pdf_view"/>
  ...
</config>
Remove all pkgs of a specified depot user:
<config>
  <remove user="cnuke"/>
  ...
</config>
Remove all pkgs except for an explicit selection of packages to keep:
<config>
  <remove-all>
    <keep user="cnuke" pkg="pdf_view"/>
  </remove-all>
  ...
</config>
The <keep> node could also give the freedom to select a particular version or a whole user.
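To make the matching rules concrete, here is a minimal Python sketch of how such a <config> could be evaluated against the pkgs present in the depot. It only illustrates the selection logic and is not meant as the actual component. The function names, the tuple representation of installed pkgs, the depot user 'jdoe', and the rule that an explicit <remove> takes precedence over a <keep> are assumptions of mine.

```python
import xml.etree.ElementTree as ET


def matches(node, user, pkg, version):
    """True if every attribute given on the node equals the pkg's value.

    Attributes that are absent act as wildcards, so <remove user="cnuke"/>
    matches all pkgs of that depot user.
    """
    actual = {"user": user, "pkg": pkg, "version": version}
    return all(node.get(key) in (None, value) for key, value in actual.items())


def select_for_removal(config_xml, installed):
    """Return the (user, pkg, version) tuples that the config selects."""
    config = ET.fromstring(config_xml)
    removes = [node for node in config if node.tag == "remove"]
    sweeps = [node for node in config if node.tag == "remove-all"]
    selected = []
    for entry in installed:
        explicit = any(matches(node, *entry) for node in removes)
        # Assumption: <remove-all> hits every pkg not covered by a <keep> child.
        swept = any(
            not any(matches(keep, *entry) for keep in sweep if keep.tag == "keep")
            for sweep in sweeps
        )
        if explicit or swept:
            selected.append(entry)
    return selected


if __name__ == "__main__":
    installed = [
        ("cnuke", "pdf_view", "2022-02-22"),
        ("cnuke", "pdf_view", "2021-10-13"),
        ("jdoe", "editor", "2022-03-01"),  # hypothetical depot user and pkg
    ]
    config = (
        '<config><remove-all>'
        '<keep user="cnuke" pkg="pdf_view"/>'
        '</remove-all></config>'
    )
    print(select_for_removal(config, installed))
```

Treating missing attributes as wildcards lets one matching function serve both <remove> and <keep>, which is also what would allow <keep> to name a specific version or a whole user.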
Would that configuration interface satisfy your needs?
The second (garbage-collection) step would collect the content of all 'pkg/<version>/archive' files found in the depot - the remaining "working set" of pkg dependencies, so to speak. With the working set determined, it would traverse all src/, bin/, and raw/ directories, check whether the respective directory is part of the working set, and remove it otherwise.
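The garbage-collection step could look roughly like the following Python sketch. It rests on several simplifying assumptions that are mine: every archive lives at <user>/<type>/<name>/<version> relative to the depot root (the real layout may have additional levels, e.g., for bin/), each pkg's archive file lists one such depot-relative path per line, and the names 'working_set', 'collect_garbage', and 'old_tool' are made up for illustration.

```python
import shutil
import tempfile
from pathlib import Path

ARCHIVE_TYPES = ("src", "bin", "raw")


def working_set(depot, archive_file="archive"):
    """Union of all archive paths listed by the pkgs still present in the depot."""
    referenced = set()
    for listing in depot.glob(f"*/pkg/*/*/{archive_file}"):
        for line in listing.read_text().splitlines():
            if line.strip():
                referenced.add(line.strip())
    return referenced


def collect_garbage(depot, archive_file="archive"):
    """Remove every src/bin/raw version directory that no pkg refers to."""
    keep = working_set(depot, archive_file)
    removed = []
    for archive_type in ARCHIVE_TYPES:
        # sorted() materializes the listing before anything gets deleted
        for version_dir in sorted(depot.glob(f"*/{archive_type}/*/*")):
            rel = version_dir.relative_to(depot).as_posix()
            if version_dir.is_dir() and rel not in keep:
                shutil.rmtree(version_dir)
                removed.append(rel)
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        depot = Path(tmp)
        for rel in ("cnuke/src/pdf_view/2022-02-22",
                    "cnuke/src/old_tool/2021-01-01"):  # hypothetical archives
            (depot / rel).mkdir(parents=True)
        pkg = depot / "cnuke/pkg/pdf_view/2022-02-22"
        pkg.mkdir(parents=True)
        (pkg / "archive").write_text("cnuke/src/pdf_view/2022-02-22\n")
        print(collect_garbage(depot))
```

Note that the sketch leaves empty <name> directories behind. A real implementation would probably want to prune those as well.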
For the directory traversal and file operations, it may be useful to take the implementation of the depot_query and fs_tool components as inspiration.
There is one open question, though: A pkg archive file can refer to other pkgs, which are implicitly installed. It would be nice to include such implicitly installed pkgs in the garbage collection. In order to do so, we would need to slightly enhance the depot-download mechanism to annotate how each pkg entered the depot. E.g., for all pkgs explicitly specified in the <installation>, we could add an empty file 'selected' inside the pkg. All pkgs without such an annotation would then be included in the garbage collection.
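One possible reading of this rule, sketched in Python: a pkg survives if it carries the 'selected' annotation or is (transitively) referenced by a pkg that does. Everything else becomes garbage. The function names, the dictionary of pkg-to-pkg references, and the pkg names in the example are hypothetical. In a real component, the set of selected pkgs would come from the presence of the 'selected' file and the references from the pkg entries of each archive file.

```python
def reachable_pkgs(selected, deps):
    """Return all pkgs reachable from the explicitly selected ones."""
    keep = set()
    pending = list(selected)
    while pending:
        pkg = pending.pop()
        if pkg in keep:
            continue  # already visited, also guards against reference cycles
        keep.add(pkg)
        pending.extend(deps.get(pkg, ()))
    return keep


def garbage_pkgs(all_pkgs, selected, deps):
    """Pkgs that are neither selected nor needed by a selected pkg."""
    return sorted(set(all_pkgs) - reachable_pkgs(selected, deps))


if __name__ == "__main__":
    # Hypothetical depot content: 'viewer' was installed explicitly and pulls
    # in 'toolkit', whereas 'leftover' is no longer referenced by anyone.
    deps = {"viewer": ["toolkit"], "toolkit": [], "leftover": ["toolkit"]}
    print(garbage_pkgs(deps.keys(), ["viewer"], deps))
```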
The feature would be a very welcome addition.
Do you think that the rough plan above is sensible?
Cheers
Norman