Build/launch private repositories as the logged-in user (within an authenticated binderhub instance)
This issue is related to #1117, and possibly also to https://github.com/alan-turing-institute/hub23-deploy/issues/272
See also https://discourse.jupyter.org/t/binderhub-with-private-gitlab-and-user-scopes/3502
Proposed change
Make it possible to fill repo provider / git credentials for a forge (e.g. gitlab instance) dynamically using the currently logged-in user information. This would help set up binderhub instances for use with private repositories for organizations that run a forge (e.g. gitlab instance).
Alternative options
The alternative option today is to have a ‘technical’ user (e.g. ‘binderhub’) on the forge (e.g. gitlab instance) that has (at least) read access to repositories. Then, a personal access token is created for that technical user and passed to the binderhub via the configuration system in GitLabRepoProvider.private_token. This token will be used to pull all repositories for all users.
This is not ideal:
- from a usability perspective, it requires the user to grant access to their repos and revoke it as needed, and involves several steps that need to be documented on the private binderhub instance
- from a security perspective, I am unsure of the implications. The project access level that has to be granted to the technical user is sometimes higher than simple read access (e.g. on gitlab, I believe that at least the ‘reporter’ access level is required) – but even when running as the logged-in user, binder may require capabilities that are not strictly necessary because of the somewhat coarse grain of the gitlab permissions model.
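For reference, the technical-user alternative described above could look roughly like the following in a BinderHub config file. This is only a sketch: the environment variable name `GITLAB_TECHNICAL_USER_TOKEN` is a made-up placeholder, and the `except NameError` fallback exists only so the snippet can also be run on its own outside of BinderHub.

```python
# binderhub_config.py (sketch of the shared technical-user setup)
import os

try:
    # get_config() is injected by traitlets when BinderHub loads a config file.
    c = get_config()  # noqa: F821
except NameError:
    # Stand-in so the sketch can be executed standalone; not part of a real config.
    from types import SimpleNamespace
    c = SimpleNamespace(GitLabRepoProvider=SimpleNamespace())

# Personal access token of the technical user (e.g. 'binderhub') on the forge.
# The variable name is hypothetical; keep the token out of the config file itself.
c.GitLabRepoProvider.private_token = os.environ.get("GITLAB_TECHNICAL_USER_TOKEN", "")

# Hostname of the private forge, as in the provider sketch in this issue.
c.GitLabRepoProvider.hostname = "gitlab.example.com"
```

With this setup every build uses the same token, whoever is logged in, which is exactly the limitation this issue is about.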
Who would use this feature?
Organizations who would like to run a private binderhub for their teams along with a forge (e.g. gitlab instance) and who want to build and launch private repositories.
(Optional): Suggest a solution
Since the repo provider will have to resolve private refs using the currently logged-in user identity, it might be useful to pass user information to the repo providers, at construction time. I’m unsure how clean it is design-wise (and looking for feedback/comments!) but an option might be to pass the handler to RepoProviders at init time.
Then, assuming that a repo provider can access the current_user for the handler, a sketch implementation of a private repo provider for a gitlab instance might look like this:
import requests
from traitlets import default

from binderhub.repoproviders import GitLabRepoProvider


class AuthGitLabProvider(GitLabRepoProvider):
    def __init__(self, *args, handler, **kwargs):
        super().__init__(*args, **kwargs)
        self.handler = handler
        # Look up the logged-in user's hub record to get their OAuth token
        ud = self.get_user_data(self.handler.get_current_user()['name'])
        self.access_token = ud['auth_state']['access_token']

    def get_user_data(self, username):
        # Blocking call to the JupyterHub REST API
        r = requests.get(c.HubOAuth.api_url + f'/users/{username}',
            headers={
                'Authorization': 'token %s' % c.HubOAuth.api_token,
            }
        )
        r.raise_for_status()
        return r.json()

    @default('git_credentials')
    def _default_git_credentials(self):
        if self.access_token:
            return r'username=oauth2\npassword={token}'.format(token=self.access_token)
        return ""


c.BinderHub.repo_providers = {'gl': AuthGitLabProvider}
c.GitLabRepoProvider.hostname = "gitlab.example.com"
Of course, the sketch implementation here is just for discussion: it has blocking calls, etc. (and obviously the api_token and api_url need to be fetched properly somehow).
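To make the "blocking calls" caveat concrete, here is one possible way to keep the user lookup off the event loop, using only the standard library. It is a sketch under assumptions, not a proposal for BinderHub's actual code: the helper names (`fetch_user_model`, `access_token_from`, `git_credentials_for`, `user_git_credentials`) are made up for illustration, and it assumes the hub has auth_state enabled and that the API token is allowed to read it.

```python
import asyncio
import json
import urllib.parse
import urllib.request


def fetch_user_model(api_url, api_token, username):
    """Blocking GET of <api_url>/users/<username> on the JupyterHub REST API."""
    url = f"{api_url.rstrip('/')}/users/{urllib.parse.quote(username, safe='')}"
    req = urllib.request.Request(url, headers={"Authorization": f"token {api_token}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def access_token_from(user_model):
    """Return the OAuth access token stored in auth_state, or None if absent."""
    auth_state = user_model.get("auth_state") or {}
    return auth_state.get("access_token")


def git_credentials_for(token):
    """Build the credentials string in the same format as the sketch above."""
    if token:
        # Raw string on purpose: the literal backslash-n is kept, as in the sketch.
        return r"username=oauth2\npassword={token}".format(token=token)
    return ""


async def user_git_credentials(api_url, api_token, username, fetch=fetch_user_model):
    """Resolve per-user git credentials without blocking the event loop."""
    loop = asyncio.get_running_loop()
    # Run the blocking HTTP call in the default thread pool executor.
    user_model = await loop.run_in_executor(None, fetch, api_url, api_token, username)
    return git_credentials_for(access_token_from(user_model))
```

The `fetch` parameter is injectable so that the logic can be exercised without a running hub, for example with a stub that returns `{"auth_state": {"access_token": "..."}}`. A user with no stored token simply gets empty credentials.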
But I’d like to gather feedback on the design and possible issues?
Thanks!
Issue Analytics
- Created 3 years ago
- Comments: 5 (5 by maintainers)
@jtpio Yes, pretty much. The only difference with my diff is that I only changed base.py and kept the get_provider function signature unmodified, so the diff would be
What this means at first glance is that to enable this use case there would be a change similar to this in binderhub:
cc @rprimet feel free to correct if the diff is different based on your local testing.
And this logic can indeed stay in the config outside of BinderHub.