hespas.estimator.roofline_estimator
Classes
|
Exceptions
- exception hespas.estimator.roofline_estimator.RooflineMissingDatatypeError
Bases:
Exception
- class hespas.estimator.roofline_estimator.RooflineEstimator(hw_config=None, **kwargs)
Bases:
ComputeEstimator- allow_multiprocess = True
- peak_flops = <hespas.estimator.config_option.ConfigOption object>
- memory_bandwidth = <hespas.estimator.config_option.ConfigOption object>
- tdp_W = <hespas.estimator.config_option.ConfigOption object>
- hbm_power_ratio = <hespas.estimator.config_option.ConfigOption object>
- per_datatype_flops = <hespas.estimator.config_option.ConfigOption object>
- warn_on_unknown_type = <hespas.estimator.config_option.ConfigOption object>
- error_on_unknown_type = <hespas.estimator.config_option.ConfigOption object>
- __get_datatype_str(datatype)
- __get_datatype_str_by_op(op_info)
- __get_flops_by_datatype_str(datatype_str)
- __get_flops_by_datatype(datatype)
- TENSOR_CORE_OPS = frozenset({'stablehlo.convolution', 'stablehlo.dot_general'})
- TENSOR_CORE_PROMOTIONS = {'f32': 'tf32'}
- __get_flops_by_op(op_info)
- __get_flops(op_info)
- compute_runtime(op_info, flops, bytes_accessed)
- __add_roofline_stats(stats_tree)
- __setup_roofline_stats()
- __setup_module_roofline_stats(module)
- __setup_per_op_roofline_stats(op_info)
- __get_per_op_stats(op_info, result)
- __add_per_datatype_tree(stats_tree, datatype)
- __merge_lower_stats_tree(upper_stats_tree, lower_stats_tree)
- __get_bytes_flops(stats_tree)
- __get_module_bytes_flops(module, result)
- __get_total_bytes_flops(module, result)
- handle_elementwise_binary(op_info)
- handle_clamp(op_info)
- handle_free_ops(op_info)
- handle_noflop_ops(op_info)
- handle_concatenate(op_info)
- handle_gather(op_info)
- handle_scatter(op_info)
- handle_convolution(op_info)
- handle_unary_elemwise(op_info)
- handle_select(op_info)
- handle_reduce(op_info)
- handle_reduce_window(op_info)
- handle_select_and_scatter(op_info)
- handle_sort(op_info)
Roofline model of an operator that, sorts 1-dimensional slices of inputs along a dimension together.
- handle_dot_general(op_info)
Calculate FLOPs for StableHLO dot_general operation.
FLOP Calculation: - For each output element, we perform a dot product across contracting dimensions - Each dot product involves: product(contracting_dims) multiply-add operations - Each multiply-add = 2 FLOPs (1 multiply + 1 add) - Total FLOPs = 2 x product(output_shape) xx product(contracting_dimension_sizes)
Example: Matrix multiplication A[M,K] x B[K,N] = C[M,N] - Output elements: M x N - Contracting dimension size: K - FLOPs = 2 x M x N x K
- handle_ragged_dot(op_info)
Ragged dot uses a new group dimension for the ragged dimension, for example in mode 1 the ragged dimension is M, it will be split into G groups each value of the group dimension tensor corresponding to how many rows of M go to each group. It has 3 modes each mode considers a different dimension as the ragged one (m, k, b) respectively. - if mode 1 apply dot_general roofline - if mode 2 assume average of k = K/G (this is the only case where ragged can reduce FLOPS) - if mode 3 assume all batches are used, so apply dot_general roofline Signatures for modes:
1 [b,m,k], [g,b,k,n], [b,g] -> [b,m,n]
2 [b,m,k], [b,k,n], [b,g] -> [g,b,m,n]
3 [b,m,k], [b,k,n], [g] -> [b,m,n]
- handle_slice_ops(op_info)
- handle_dynamic_update_slice(op_info)
- handle_fusion(op_info)
- handle_custom_call(op_info)
- _cache_hit_hooks = [<function ComputeEstimator.__count_cache_hits>, <function ComputeEstimator.__get_cached_module_times>, <function ComputeEstimator.__print_cached_runtime>]
- _cache_miss_hooks = []
- _default_op_handler(op_info: OpInfo) OpResult
This is a default for descended classes that an exception is thrown if the operator isn’t known. This can be overriden through the @register_default_op_handler decorator, but not directly. This should not be directly overidden or called manually.
- Parameters:
op_info – The operator to estimate the time of. This method will not estimate, and just throw an exception
- Returns:
The result of the estimator for this operator (to match the type of operator estimators - will not return)
- Raises:
InvalidOpError – Raises an InvalidOpError for any unknown operator
- _init_hooks = [<function RooflineEstimator.__setup_roofline_stats>]
- _metadata_hooks = []
- _module_metadata_hooks = []
- _op_handlers = {'func.call': <function RooflineEstimator.handle_free_ops>, 'func.return': <function RooflineEstimator.handle_free_ops>, 'mhlo.bitcast': <function RooflineEstimator.handle_noflop_ops>, 'mhlo.copy': <function RooflineEstimator.handle_noflop_ops>, 'mhlo.fusion': <function RooflineEstimator.handle_fusion>, 'mhlo.ragged_dot': <function RooflineEstimator.handle_ragged_dot>, 'mhlo.return': <function RooflineEstimator.handle_free_ops>, 'stablehlo.abs': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.add': <function RooflineEstimator.handle_elementwise_binary>, 'stablehlo.and': <function RooflineEstimator.handle_elementwise_binary>, 'stablehlo.broadcast_in_dim': <function RooflineEstimator.handle_noflop_ops>, 'stablehlo.clamp': <function RooflineEstimator.handle_clamp>, 'stablehlo.compare': <function RooflineEstimator.handle_elementwise_binary>, 'stablehlo.complex': <function RooflineEstimator.handle_free_ops>, 'stablehlo.concatenate': <function RooflineEstimator.handle_concatenate>, 'stablehlo.constant': <function RooflineEstimator.handle_free_ops>, 'stablehlo.convert': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.convolution': <function RooflineEstimator.handle_convolution>, 'stablehlo.cosine': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.custom_call': <function RooflineEstimator.handle_custom_call>, 'stablehlo.divide': <function RooflineEstimator.handle_elementwise_binary>, 'stablehlo.dot': <function RooflineEstimator.handle_dot_general>, 'stablehlo.dot_general': <function RooflineEstimator.handle_dot_general>, 'stablehlo.dynamic_slice': <function RooflineEstimator.handle_slice_ops>, 'stablehlo.dynamic_update_slice': <function RooflineEstimator.handle_dynamic_update_slice>, 'stablehlo.exponential': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.gather': <function RooflineEstimator.handle_gather>, 'stablehlo.get_tuple_element': <function RooflineEstimator.handle_free_ops>, 'stablehlo.imag': <function RooflineEstimator.handle_free_ops>, 'stablehlo.iota': <function RooflineEstimator.handle_free_ops>, 'stablehlo.is_finite': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.log': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.logistic': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.maximum': <function RooflineEstimator.handle_elementwise_binary>, 'stablehlo.minimum': <function RooflineEstimator.handle_elementwise_binary>, 'stablehlo.multiply': <function RooflineEstimator.handle_elementwise_binary>, 'stablehlo.negate': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.not': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.optimization_barrier': <function RooflineEstimator.handle_free_ops>, 'stablehlo.or': <function RooflineEstimator.handle_elementwise_binary>, 'stablehlo.pad': <function RooflineEstimator.handle_noflop_ops>, 'stablehlo.partition_id': <function RooflineEstimator.handle_free_ops>, 'stablehlo.power': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.real': <function RooflineEstimator.handle_free_ops>, 'stablehlo.reduce': <function RooflineEstimator.handle_reduce>, 'stablehlo.reduce_precision': <function RooflineEstimator.handle_noflop_ops>, 'stablehlo.reduce_window': <function RooflineEstimator.handle_reduce_window>, 'stablehlo.replica_id': <function RooflineEstimator.handle_free_ops>, 'stablehlo.reshape': <function RooflineEstimator.handle_noflop_ops>, 'stablehlo.return': <function RooflineEstimator.handle_free_ops>, 'stablehlo.reverse': <function RooflineEstimator.handle_noflop_ops>, 'stablehlo.round_nearest_even': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.rsqrt': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.scatter': <function RooflineEstimator.handle_scatter>, 'stablehlo.select': <function RooflineEstimator.handle_select>, 'stablehlo.select_and_scatter': <function RooflineEstimator.handle_select_and_scatter>, 'stablehlo.sign': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.sine': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.slice': <function RooflineEstimator.handle_slice_ops>, 'stablehlo.sort': <function RooflineEstimator.handle_sort>, 'stablehlo.sqrt': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.subtract': <function RooflineEstimator.handle_elementwise_binary>, 'stablehlo.tanh': <function RooflineEstimator.handle_unary_elemwise>, 'stablehlo.transpose': <function RooflineEstimator.handle_noflop_ops>, 'stablehlo.xor': <function RooflineEstimator.handle_elementwise_binary>}
- _post_estimate_hooks = [<function ComputeEstimator.__setup_per_op_tree>, <function ComputeEstimator.__get_total_estimate_time>, <function ComputeEstimator.__get_total_runtime>, <function ComputeEstimator.__count_processed>, <function ComputeEstimator.__get_per_op_runtime>, <function RooflineEstimator.__get_total_bytes_flops>]
- _post_op_hooks = [<function ComputeEstimator.__get__module_op_times>, <function RooflineEstimator.__get_per_op_stats>]
- _post_run_hooks = [<function ComputeEstimator.__get_module_runtime>, <function ComputeEstimator.__module_run_end_time>, <function ComputeEstimator.__print_run_runtime>, <function RooflineEstimator.__get_module_bytes_flops>]
- _pre_estimate_hooks = [<function ComputeEstimator.__setup_per_module_stat_tree>, <function ComputeEstimator.__total_estimate_start_time>, <function ComputeEstimator.__print_start_line>, <function RooflineEstimator.__setup_module_roofline_stats>]
- _pre_op_hooks = [<function ComputeEstimator.__setup_per_op_stat_tree>, <function RooflineEstimator.__setup_per_op_roofline_stats>]
- _pre_run_hooks = [<function ComputeEstimator.__module_run_start_time>]
- bases_order = {'compute': 0}
- config_arguments = {'cache_dir': <hespas.estimator.config_option.ConfigOption object>, 'disable_cache': <hespas.estimator.config_option.ConfigOption object>, 'error_on_unknown_type': <hespas.estimator.config_option.ConfigOption object>, 'hbm_power_ratio': <hespas.estimator.config_option.ConfigOption object>, 'in_memory_only_cache': <hespas.estimator.config_option.ConfigOption object>, 'memory_bandwidth': <hespas.estimator.config_option.ConfigOption object>, 'num_npus': <hespas.estimator.config_option.ConfigOption object>, 'peak_flops': <hespas.estimator.config_option.ConfigOption object>, 'per_datatype_flops': <hespas.estimator.config_option.ConfigOption object>, 'tdp_W': <hespas.estimator.config_option.ConfigOption object>, 'type': <hespas.estimator.config_option.ConfigOption object>, 'warn_on_unknown_type': <hespas.estimator.config_option.ConfigOption object>}
- config_options = {'cache_dir': <hespas.estimator.config_option.ConfigOption object>, 'disable_cache': <hespas.estimator.config_option.ConfigOption object>, 'error_on_unknown_type': <hespas.estimator.config_option.ConfigOption object>, 'hbm_power_ratio': <hespas.estimator.config_option.ConfigOption object>, 'in_memory_only_cache': <hespas.estimator.config_option.ConfigOption object>, 'memory_bandwidth': <hespas.estimator.config_option.ConfigOption object>, 'num_npus': <hespas.estimator.config_option.ConfigOption object>, 'peak_flops': <hespas.estimator.config_option.ConfigOption object>, 'per_datatype_flops': <hespas.estimator.config_option.ConfigOption object>, 'tdp_W': <hespas.estimator.config_option.ConfigOption object>, 'type': <hespas.estimator.config_option.ConfigOption object>, 'warn_on_unknown_type': <hespas.estimator.config_option.ConfigOption object>}
- display_name = 'roofline'
- display_name_map = {'compute': <class 'hespas.estimator.compute_estimator.ComputeEstimator'>}