Hi,
That link is broken for me - is this one the same thing?
hsds/docs/docker_install_aws.md at master · HDFGroup/hsds (github.com)
This is working nicely so far. A couple of notes along the way:
- I had thought that hsds could be used through pip. Maybe it can, but it appears just to be the code, without all of the docker and configuration stuff that the git clone download arrives with. I’m going with the latter this time as per instructions.
- Item 5 says to run setup.py but there isn’t a file with this name. I ran ./build.sh, which appeared to do a sensible thing.
- On Amazon Linux, we need yum rather than apt to install things like docker.
- I needed pyflakes to be installed for build.sh to run. I had a python venv with pip so just added pyflakes to that and activated it before running build.sh.
- I used this page to install docker-compose on Amazon Linux: Amazon Linux 2 - install docker & docker-compose using ‘sudo amazon-linux-extras’ command (github.com)
HSDS Runs!
I can follow the server log output to see what is happening with:
docker logs --follow hsds-sn-1
The challenge now is to configure my running HSDS server on EC2 so that it reads from the pre-existing S3 bucket as the data back end.
I’ve added the following in hsds/admin/config/override.yml
aws_s3_gateway: https://s3.us-west-2.amazonaws.com
aws_region: us-west-2
default_public: True
bucket_name: nrel-pds-wtk
greeting: If you see this, the override is working!
aws_s3_no_sign_request: True
Is it working?
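One way to check whether the override was picked up is to ask the server for its greeting and compare it with the string set in override.yml. The sketch below assumes the service node exposes a `/about` endpoint that returns JSON containing a `greeting` field, and that the server is reachable at whatever endpoint URL you pass in. Both are assumptions to verify against your own install. The helper names `fetch_about` and `greeting_matches` are made up for illustration.

```python
import json
import urllib.request

# The greeting string copied from the override.yml shown above.
EXPECTED_GREETING = "If you see this, the override is working!"


def greeting_matches(about_info, expected=EXPECTED_GREETING):
    """Return True if the server's reported greeting equals the override value."""
    return about_info.get("greeting") == expected


def fetch_about(endpoint):
    """Fetch and decode the JSON from <endpoint>/about.

    Assumption: HSDS serves server info at /about. Adjust if your version differs.
    """
    url = endpoint.rstrip("/") + "/about"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


# Usage (the endpoint is an assumption; use the one from your own setup):
#   info = fetch_about("http://localhost:5101")
#   print("override applied" if greeting_matches(info) else "override NOT applied")
```

If the greeting comes back unchanged, the override file is probably not being read at all, which would be worth ruling out before looking at the bucket settings.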
It looks like there are two buckets: nrel-pds-wtk, and nrel-pds-hsds. I had thought that the hsds one would be right, but if I run hsls on this, it just locks up the server. Is this just because the data is divided into a zillion chunks?
The next attempt is to try nrel-pds-wtk as the bucket, as configured in override.yml.
The result is that hsls just returns / and nothing else, and even hsls -r doesn’t return anything more, so no dice.
However, if I look in the server logs, it is suddenly listing out loads of interesting things, /Great_Lakes, /Hawaii and so on. All of them have a
WARN> fetch result - not found error for: /Offshore_CA
and a
WARN> get_domains - domain: /Great_Lakes not found in crawler dict
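To see at a glance which domains trigger each warning, the log output can be summarised with a short script. This is only a sketch: the two patterns are copied from the WARN lines quoted above, and the rest of the log format is assumed rather than known. The function name `summarize_warnings` is made up for illustration.

```python
import re

# Patterns taken from the two WARN lines quoted above.
NOT_FOUND = re.compile(r"WARN> fetch result - not found error for: (\S+)")
NOT_IN_CRAWLER = re.compile(
    r"WARN> get_domains - domain: (\S+) not found in crawler dict"
)


def summarize_warnings(lines):
    """Return (not_found, not_in_crawler): sorted domain names per warning type."""
    not_found = set()
    not_in_crawler = set()
    for line in lines:
        match = NOT_FOUND.search(line)
        if match:
            not_found.add(match.group(1))
        match = NOT_IN_CRAWLER.search(line)
        if match:
            not_in_crawler.add(match.group(1))
    return sorted(not_found), sorted(not_in_crawler)


# Example using the two lines from this post.
sample = [
    "WARN> fetch result - not found error for: /Offshore_CA",
    "WARN> get_domains - domain: /Great_Lakes not found in crawler dict",
]
not_found, not_in_crawler = summarize_warnings(sample)
print("not found:", not_found)
print("not in crawler dict:", not_in_crawler)
```

On a real server the lines could come from saving the output of `docker logs hsds-sn-1` to a file and passing the open file to `summarize_warnings`.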
Progress report
HSDS is now running nicely in a docker setup on EC2 in us-west-2.
If you have any advice on how to configure it to connect to the right bucket, so that it can successfully read e.g. the 100m layer of the 2km resolution windspeed grid in the US, I’d be hugely appreciative. I’m stumped so far, but feel like I’m close.