[Computer Vision] Implementing EfficientNet with the timm package

1. Introduction to EfficientNet

We provide implementations and pretrained weights for the EfficientNet family of models.

Paper: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks.

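This code and the weights have been ported from the timm implementation. As a consequence, some of the model weights have travelled from TF (the original weights from the Google Brain team) to PyTorch (the timm library) and back to TF (tfimm, the TF port of timm).

The following models are available.

2. Available models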

2.1 MobileNet-V2 models

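These models correspond to the mobilenetv2_... models in timm.

mobilenet_v2_{050, 100, 140}. These are MobileNet-V2 models with the channel multiplier set to 0.5, 1.0 and 1.4, respectively.

mobilenet_v2_{110d, 120d}. These are MobileNet-V2 models with the (channel, depth) multipliers set to (1.1, 1.2) and (1.2, 1.4), respectively.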
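To make the multipliers concrete, here is a minimal sketch of one common way they are applied in MobileNet- and EfficientNet-style code: channel counts are scaled and rounded to a multiple of 8, and the number of block repeats is scaled and rounded up. The helper names `scale_channels` and `scale_depth`, the divisor of 8 and the 90% rounding guard are assumptions for illustration; they are not taken from this library's source.

```python
import math


def scale_channels(channels, multiplier, divisor=8):
    """Scale a channel count and round it to a multiple of `divisor`.

    Assumption: divisor=8 and the 90% guard follow a widely used
    MobileNet-style convention; the library may use other constants.
    """
    scaled = channels * multiplier
    rounded = max(divisor, int(scaled + divisor / 2) // divisor * divisor)
    # Avoid shrinking the layer by more than 10% through rounding.
    if rounded < 0.9 * scaled:
        rounded += divisor
    return rounded


def scale_depth(repeats, multiplier):
    """Scale the number of block repeats in a stage, rounding up."""
    return int(math.ceil(repeats * multiplier))


for mult in (0.5, 1.0, 1.4):
    print(mult, scale_channels(32, mult))
print(scale_depth(2, 1.2))
```

Under these assumptions a 32-channel layer becomes 16, 32 and 48 channels for the multipliers 0.5, 1.0 and 1.4, and a stage with 2 repeats grows to 3 with a depth multiplier of 1.2.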

2.2 Original EfficientNet models

These models correspond to the tf_... models in timm.

efficientnet_{b0, b1, b2, b3, b4, b5, b6, b7, b8}
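The brace notation is shorthand for nine separate model names. The sketch below expands it and applies the tf_ prefix rule stated in this section. The helper name `to_timm_name` is hypothetical, and the resulting timm names are derived from that rule rather than looked up in timm, so treat them as an assumption.

```python
# Expand the shorthand efficientnet_{b0, ..., b8} into explicit names.
model_names = [f"efficientnet_b{i}" for i in range(9)]


def to_timm_name(name):
    """Hypothetical helper: prepend the tf_ prefix used on the timm side."""
    return f"tf_{name}"


for name in model_names:
    print(f"{name:>16} -> {to_timm_name(name)}")
```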

2.3 EfficientNet AdvProp models, trained with adversarial examples

These models correspond to the tf_... models in timm.

efficientnet_{b0, ..., b8}_ap

2.4 EfficientNet NoisyStudent models, trained via semi-supervised learning

These models correspond to the tf_... models in timm.

efficientnet_{b0, ..., b7}_ns

efficientnet_l2_ns_475

efficientnet_l2

2.5 PyTorch versions of the EfficientNet models

These models use symmetric padding instead of the "same" padding that is the default in TF. They correspond to the efficientnet_... models in timm.

pt_efficientnet_{b0, ..., b4}
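The difference between the two padding schemes shows up in strided convolutions. The following sketch computes the padding along one axis for TF-style "same" padding and for PyTorch-style symmetric padding; the function names are made up for this illustration, and dilation is ignored for simplicity.

```python
import math


def tf_same_padding(size, kernel, stride):
    """Padding (before, after) along one axis for TF-style "same" padding."""
    out = math.ceil(size / stride)
    total = max((out - 1) * stride + kernel - size, 0)
    before = total // 2
    return before, total - before  # any odd pixel goes to the end


def symmetric_padding(kernel):
    """Padding (before, after) for PyTorch-style symmetric padding."""
    pad = (kernel - 1) // 2
    return pad, pad


# 3x3 convolution with stride 2 on a 224-pixel axis.
print(tf_same_padding(224, 3, 2))
print(symmetric_padding(3))
```

For this case "same" padding adds (0, 1) pixels while symmetric padding adds (1, 1). Both give an output size of 112, but the convolution windows are aligned differently on the input.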

2.6 EfficientNet-EdgeTPU models, optimized for inference on Google’s Edge TPU hardware

These models correspond to the tf_... models in timm.

efficientnet_es

efficientnet_em

efficientnet_el

2.7 EfficientNet-Lite models, optimized for inference on mobile devices, CPUs and GPUs

These models correspond to the tf_... models in timm.

efficientnet_lite0

efficientnet_lite1

efficientnet_lite2

efficientnet_lite3

efficientnet_lite4

2.8 EfficientNet-V2 models

These models correspond to the tf_... models in timm.

efficientnet_v2_b0

efficientnet_v2_b1

efficientnet_v2_b2

efficientnet_v2_b3

efficientnet_v2_s

efficientnet_v2_m

efficientnet_v2_l

2.9 EfficientNet-V2 models, pretrained on ImageNet-21k, fine-tuned on ImageNet-1k

These models correspond to the tf_... models in timm.

efficientnet_v2_s_in21ft1k

efficientnet_v2_m_in21ft1k

efficientnet_v2_l_in21ft1k

efficientnet_v2_xl_in21ft1k

2.10 EfficientNet-V2 models, pretrained on ImageNet-21k

These models correspond to the tf_... models in timm.

efficientnet_v2_s_in21k

efficientnet_v2_m_in21k

efficientnet_v2_l_in21k

efficientnet_v2_xl_in21k

3. EfficientNetConfig

class EfficientNetConfig(name='', url='', nb_classes=1000, in_channels=3, input_size=(224, 224), stem_size=32, architecture=(), channel_multiplier=1.0, depth_multiplier=1.0, fix_first_last=False, nb_features=1280, drop_rate=0.0, drop_path_rate=0.0, norm_layer='batch_norm', act_layer='swish', padding='symmetric', crop_pct=0.875, interpolation='bicubic', mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), first_conv='conv_stem', classifier='classifier')

Configuration class for EfficientNet models.

Parameters:
name (str) – Name of the model.

url (str) – URL for pretrained weights.

nb_classes (int) – Number of classes for classification head.

in_channels (int) – Number of input image channels.

input_size (Tuple[int, int]) – Input image size (height, width)

stem_size (int) – Number of filters in first convolution.

architecture (Tuple[Tuple[str, ...], ...]) – Tuple of tuple of strings defining the architecture of residual blocks. The outer tuple defines the stages while the inner tuple defines the blocks per stage.

channel_multiplier (float) – Multiplier for channel scaling. One of the three dimensions of EfficientNet scaling.

depth_multiplier (float) – Multiplier for depth scaling. One of the three dimensions of EfficientNet scaling.

fix_first_last (bool) – Fix first and last block depths when multiplier is applied.

nb_features (int) – Number of features before the classifier layer.

drop_rate (float) – Dropout rate.

drop_path_rate (float) – Dropout rate for stochastic depth.

norm_layer (str) – Normalization layer. See norm_layer_factory() for possible values.

act_layer (str) – Activation function. See act_layer_factory() for possible values.

padding (str) – Type of padding to use for convolutional layers. Can be one of “same”, “valid” or “symmetric” (PyTorch-style symmetric padding).

crop_pct (float) – Crop percentage for ImageNet evaluation.

interpolation (str) – Interpolation method for ImageNet evaluation.

mean (Tuple[float, float, float]) – Defines preprocessing function. If x is an image with pixel values in (0, 1), the preprocessing function is (x - mean) / std.

std (Tuple[float, float, float]) – Defines preprocessing function.

first_conv (str) – Name of first convolutional layer. Used by create_model() to adapt the number of input channels when loading pretrained weights.

classifier (str) – Name of classifier layer. Used by create_model() to adapt the classifier when loading pretrained weights.
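As a small worked example of the preprocessing-related parameters, the sketch below applies (x - mean) / std to a single RGB pixel and computes the resize target implied by crop_pct. The helper names are hypothetical, and the resize-then-center-crop reading of crop_pct is an assumption based on the common ImageNet evaluation recipe, not something this page spells out.

```python
import math

MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Apply (x - mean) / std per channel to one RGB pixel in [0, 1]."""
    return tuple((x - m) / s for x, m, s in zip(rgb, mean, std))


def eval_resize_size(input_size, crop_pct):
    """Size to resize to before center-cropping to `input_size`.

    Assumption: this follows the common ImageNet evaluation recipe
    (resize to input_size / crop_pct, then center-crop).
    """
    return tuple(int(math.floor(side / crop_pct)) for side in input_size)


print(normalize_pixel((1.0, 1.0, 1.0)))
print(eval_resize_size((224, 224), 0.875))
```

With the default values, a 224x224 input and crop_pct=0.875 correspond to resizing to 256x256 before cropping.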

4. EfficientNet

class EfficientNet(*args, **kwargs)

Generic EfficientNet implementation that supports depth and width scaling as well as flexible architecture definitions, including: EfficientNet B0-B7

Parameters:
cfg (EfficientNetConfig) – Configuration class for the model.

**kwargs – Arguments are passed to tf.keras.Model.

call(x, training=False, return_features=False)

Forward pass through the full model.

Parameters:
x – Input to model

training (bool) – Training or inference phase?

return_features (bool) – If True, we return not only the model output, but a dictionary with intermediate features.

Returns:
If return_features=True, we return a tuple (y, features), where y is the model output and features is a dictionary with intermediate features.

If return_features=False, we return only y.

property dummy_inputs: Tensor

Returns a tensor of the correct shape for inference.

property feature_names: List[str]

Names of the features returned when calling call with return_features=True.

forward_features(x, training=False, return_features=False)

Forward pass through the model, excluding the classifier layer. This function is useful if the model is used as input for downstream tasks such as object detection.

Parameters:
x – Input to model

training (bool) – Training or inference phase?

return_features (bool) – If True, we return not only the model output, but a dictionary with intermediate features.

Returns:
If return_features=True, we return a tuple (y, features), where y is the model output and features is a dictionary with intermediate features.

If return_features=False, we return only y.
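The return contract of call and forward_features can be illustrated with a plain-Python stand-in. `ToyBackbone` below is not the library's EfficientNet class: its stage names and arithmetic are made up, and it only mimics the documented behaviour of return_features and feature_names.

```python
class ToyBackbone:
    """Plain-Python stand-in that mimics the documented return contract.

    Not the library's class; stage names and arithmetic are invented.
    """

    _stages = ["stem", "block_0", "features"]

    @property
    def feature_names(self):
        # Keys of the dictionary returned when return_features=True.
        return self._stages + ["logits"]

    def forward_features(self, x, training=False, return_features=False):
        features = {}
        for i, name in enumerate(self._stages):
            x = x + i  # placeholder for a real layer
            features[name] = x
        return (x, features) if return_features else x

    def call(self, x, training=False, return_features=False):
        y, features = self.forward_features(x, training, return_features=True)
        y = y * 2  # placeholder for the classifier head
        features["logits"] = y
        return (y, features) if return_features else y


model = ToyBackbone()
y = model.call(1)
y2, feats = model.call(1, return_features=True)
print(y, sorted(feats))
```

With return_features=False the caller gets only the output; with return_features=True it gets the same output plus a dictionary keyed by feature_names, which is the form a downstream head such as a detector would consume.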