Bug 1263955 (Closed) — Opened 9 years ago, Closed 8 years ago

Presto should limit the amount of memory used to allow other things to run on the instance

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P2)
Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: azhang, Assigned: robotblake)

References

Details

(Whiteboard: [SvcOps])

When a query is running on the Presto machine [1], it uses almost all of the available memory. When this happens, Parquet2Hive doesn't have enough and fails with:

> Failure to parse dataset, 'NoneType' object has no attribute 'group'

What we probably want to do is limit the memory used by Presto.

[1]: hadoop@ec2-54-218-5-112.us-west-2.compute.amazonaws.com
Flags: needinfo?(whd)
The current Presto configuration claims a large chunk of the available memory on that instance [1]. The re:dash Celery queue can also consume a large amount of memory, and I'm not sure whether that is bounded anywhere. When both happen, almost no memory is left for other processes.

We should deploy the re:dash service on its own instance; that would also make it easier to redeploy Presto with a new configuration.

[1] https://github.com/vitillo/emr-bootstrap-presto/blob/master/ansible/files/telemetry.sh#L54
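For reference, Presto's memory footprint is bounded by the JVM heap in `etc/jvm.config` together with the per-query limits in `etc/config.properties`. The sketch below shows the relevant knobs; the values are illustrative assumptions for this instance, not the settings currently deployed by the bootstrap script.

```
# etc/jvm.config -- cap the overall Presto JVM heap so other services
# (Parquet2Hive, the re:dash Celery workers) keep some headroom.
# Value is an assumption, not the current deployed setting.
-Xmx8G

# etc/config.properties -- bound how much of that heap a single query
# may consume, cluster-wide and per node.
query.max-memory=4GB
query.max-memory-per-node=2GB
```

Queries that exceed these limits fail with an exceeded-memory-limit error instead of starving the rest of the box, which trades occasional query failures for predictable coexistence with the other services.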
Whiteboard: [SvcOps]
Reassigning to Travis in light of whd's availability.
Flags: needinfo?(tblow)
Assignee: nobody → bimsland
Points: --- → 2
Priority: -- → P2
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Flags: needinfo?(whd)
Flags: needinfo?(tblow)
Product: Cloud Services → Cloud Services Graveyard