GPU-Manager Installation

Note: This article focuses on installing GPU-Manager. Parts of the content have been updated so it works out of the box. If you find it useful, please bookmark it ❤

Table of Contents

  • Preface
  • I. Constraints
  • II. Usage Steps
    • 1. Pull the images
        • 1.1 Check the Docker cgroup driver
    • 2. Deploy gpu-manager
    • 3. Deploy gpu-admission
    • 4. Modify kube-scheduler.yaml
      • 4.1 Create /etc/kubernetes/scheduler-policy-config.json
      • 4.2 Create /etc/kubernetes/scheduler-extender.yaml
      • 4.3 Modify /etc/kubernetes/manifests/kube-scheduler.yaml
    • 5. Verify the result
    • Testing
  • Summary


Preface

This article covers only the out-of-the-box setup. For the technical details of the GPUManager virtualization solution, go straight to: GPUmanager virtualization solution.


I. Constraints

1. The virtual machine must be bound to a physical GPU in passthrough mode. I did this with a Hyper-V virtual machine; see my previous article for the binding procedure.
2. Kubernetes 1.10 or later is required.
3. GPU-Manager requires the cluster to contain GPU-equipped nodes.
4. Each GPU card provides 100 units of compute resource; only fractional cards in the range 0–1 and whole cards in integer multiples of 1 are supported. GPU memory is allocated in minimum units of 256 MiB (see the worked example below).

My version: Kubernetes 1.20.
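As a worked example of these units (the numbers here are illustrative, not from the original post): to request 0.3 of a card with 1 GiB of GPU memory, ask for 30 vcuda-core units and 4 vcuda-memory units, since 30/100 = 0.3 and 4 × 256 MiB = 1 GiB. In a pod spec that looks like:

```yaml
resources:
  limits:
    tencent.com/vcuda-core: "30"   # 30 of 100 units = 0.3 of one physical GPU
    tencent.com/vcuda-memory: "4"  # 4 x 256 MiB = 1 GiB of GPU memory
```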

II. Usage Steps

1. Pull the images

gpu-manager: https://hub.docker.com/r/tkestack/gpu-manager/tags

docker pull tkestack/gpu-manager:v1.1.5

gpu-quota-admission: https://hub.docker.com/r/tkestack/gpu-quota-admission/tags

docker pull tkestack/gpu-quota-admission:v1.0.0
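To confirm both pulls succeeded on the node, a routine check (my addition, not from the original post):

```sh
docker images | grep -E 'gpu-manager|gpu-quota-admission'
```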

1.1 Check the Docker cgroup driver:

docker info 

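The field to look for in the output is Cgroup Driver. A quick filter (the grep is my addition; the output line is standard Docker CLI output):

```sh
docker info 2>/dev/null | grep -i 'cgroup driver'
# Cgroup Driver: systemd   <- note this value; it matters in step 2
```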

2. Deploy gpu-manager

Label each node that has a GPU:

kubectl label node <node-name> nvidia-device-enable=enable
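You can confirm the label took effect with a standard selector query (my addition); this is the same label the DaemonSet's nodeSelector matches below:

```sh
kubectl get nodes -l nvidia-device-enable=enable
```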

If the Docker cgroup driver is systemd, this must be specified in the YAML, because GPUmanager defaults to cgroupfs. In the manifest below this is done through the EXTRA_FLAGS environment variable (--cgroup-driver=systemd).

Create gpu-manager.yaml with the following content:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gpu-manager
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gpu-manager-role
subjects:
  - kind: ServiceAccount
    name: gpu-manager
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: gpu-manager-daemonset
  namespace: kube-system
spec:
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      name: gpu-manager-ds
  template:
    metadata:
      # This annotation is deprecated. Kept here for backward compatibility
      # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
      labels:
        name: gpu-manager-ds
    spec:
      serviceAccount: gpu-manager
      tolerations:
        # This toleration is deprecated. Kept here for backward compatibility
        # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
        - key: CriticalAddonsOnly
          operator: Exists
        - key: tencent.com/vcuda-core
          operator: Exists
          effect: NoSchedule
      # Mark this pod as a critical add-on; when enabled, the critical add-on
      # scheduler reserves resources for critical add-on pods so that they can
      # be rescheduled after a failure.
      # See https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
      priorityClassName: "system-node-critical"
      # only run on nodes that have a gpu device
      nodeSelector:
        nvidia-device-enable: enable
      hostPID: true
      containers:
        - image: tkestack/gpu-manager:v1.1.5
          imagePullPolicy: IfNotPresent
          name: gpu-manager
          securityContext:
            privileged: true
          ports:
            - containerPort: 5678
          volumeMounts:
            - name: device-plugin
              mountPath: /var/lib/kubelet/device-plugins
            - name: vdriver
              mountPath: /etc/gpu-manager/vdriver
            - name: vmdata
              mountPath: /etc/gpu-manager/vm
            - name: log
              mountPath: /var/log/gpu-manager
            - name: checkpoint
              mountPath: /etc/gpu-manager/checkpoint
            - name: run-dir
              mountPath: /var/run
            - name: cgroup
              mountPath: /sys/fs/cgroup
              readOnly: true
            - name: usr-directory
              mountPath: /usr/local/host
              readOnly: true
            - name: kube-root
              mountPath: /root/.kube
              readOnly: true
          env:
            - name: LOG_LEVEL
              value: "4"
            - name: EXTRA_FLAGS
              value: "--cgroup-driver=systemd"
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
      volumes:
        - name: device-plugin
          hostPath:
            type: Directory
            path: /var/lib/kubelet/device-plugins
        - name: vmdata
          hostPath:
            type: DirectoryOrCreate
            path: /etc/gpu-manager/vm
        - name: vdriver
          hostPath:
            type: DirectoryOrCreate
            path: /etc/gpu-manager/vdriver
        - name: log
          hostPath:
            type: DirectoryOrCreate
            path: /etc/gpu-manager/log
        - name: checkpoint
          hostPath:
            type: DirectoryOrCreate
            path: /etc/gpu-manager/checkpoint
        # We have to mount the whole /var/run directory into the container, because of
        # the bind-mounted docker.sock: its inode changes after host docker is restarted
        - name: run-dir
          hostPath:
            type: Directory
            path: /var/run
        - name: cgroup
          hostPath:
            type: Directory
            path: /sys/fs/cgroup
        # We have to mount the /usr directory instead of a specific library path,
        # because the library path differs across distros
        - name: usr-directory
          hostPath:
            type: Directory
            path: /usr
        - name: kube-root
          hostPath:
            type: Directory
            path: /root/.kube
```

Apply the manifest:

kubectl apply -f gpu-manager.yaml
kubectl get pod -A | grep gpu    # check that the gpu-manager pod is running
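If the pod does not come up, the DaemonSet status and its logs are the first places to look (my addition; standard kubectl usage, with names taken from the manifest above):

```sh
kubectl -n kube-system get ds gpu-manager-daemonset
kubectl -n kube-system logs -l name=gpu-manager-ds
```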

3. Deploy gpu-admission

Create gpu-admission.yaml with the following content:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gpu-admission
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gpu-admission-as-kube-scheduler
subjects:
  - kind: ServiceAccount
    name: gpu-admission
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:kube-scheduler
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gpu-admission-as-volume-scheduler
subjects:
  - kind: ServiceAccount
    name: gpu-admission
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:volume-scheduler
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gpu-admission-as-daemon-set-controller
subjects:
  - kind: ServiceAccount
    name: gpu-admission
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:controller:daemon-set-controller
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    component: scheduler
    tier: control-plane
    app: gpu-admission
  name: gpu-admission
  namespace: kube-system
spec:
  selector:
    matchLabels:
      component: scheduler
      tier: control-plane
  replicas: 1
  template:
    metadata:
      labels:
        component: scheduler
        tier: control-plane
        version: second
    spec:
      serviceAccountName: gpu-admission
      containers:
        - image: thomassong/gpu-admission:47d56ae9
          name: gpu-admission
          env:
            - name: LOG_LEVEL
              value: "4"
          ports:
            - containerPort: 3456
      dnsPolicy: ClusterFirstWithHostNet
      hostNetwork: true
      priority: 2000000000
      priorityClassName: system-cluster-critical
---
apiVersion: v1
kind: Service
metadata:
  name: gpu-admission
  namespace: kube-system
spec:
  ports:
    - port: 3456
      protocol: TCP
      targetPort: 3456
  selector:
    app: gpu-admission
  type: ClusterIP
```

Apply the manifest:

kubectl create -f gpu-admission.yaml
kubectl get pod -A | grep gpu    # check that the gpu-admission pod is running
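The kube-scheduler extender configured in the next step reaches gpu-admission through the Service created above, so it is worth confirming the Service exists before moving on (my addition):

```sh
kubectl -n kube-system get svc gpu-admission
```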

4. Modify kube-scheduler.yaml

4.1 Create /etc/kubernetes/scheduler-policy-config.json

Create the file:

vim /etc/kubernetes/scheduler-policy-config.json

and paste in the following content:
```json
{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {"name": "PodFitsHostPorts"},
    {"name": "PodFitsResources"},
    {"name": "NoDiskConflict"},
    {"name": "MatchNodeSelector"},
    {"name": "HostName"}
  ],
  "priorities": [
    {"name": "BalancedResourceAllocation", "weight": 1},
    {"name": "ServiceSpreadingPriority", "weight": 1}
  ],
  "extenders": [
    {
      "urlPrefix": "http://gpu-admission.kube-system:3456/scheduler",
      "apiVersion": "v1beta1",
      "filterVerb": "predicates",
      "enableHttps": false,
      "nodeCacheCapable": false
    }
  ],
  "hardPodAffinitySymmetricWeight": 10,
  "alwaysCheckAllPredicates": false
}
```
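For orientation (my note, not from the original post): with this policy, the scheduler invokes the extender at urlPrefix + "/" + filterVerb, POSTing an ExtenderArgs JSON document that describes the pod and the candidate nodes. A crude in-cluster reachability probe might look like the sketch below; the exact response to an empty body depends on the gpu-admission implementation, but a refused connection indicates a Service wiring problem:

```sh
# Run from any pod that has curl and cluster DNS:
curl -s -X POST http://gpu-admission.kube-system:3456/scheduler/predicates
```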

4.2 Create /etc/kubernetes/scheduler-extender.yaml

Create the file:

vim /etc/kubernetes/scheduler-extender.yaml

and paste in the following content:
```yaml
apiVersion: kubescheduler.config.k8s.io/v1alpha1
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: "/etc/kubernetes/scheduler.conf"
algorithmSource:
  policy:
    file:
      path: "/etc/kubernetes/scheduler-policy-config.json"
```

4.3 Modify /etc/kubernetes/manifests/kube-scheduler.yaml

Edit the file:

vim /etc/kubernetes/manifests/kube-scheduler.yaml

so that it looks like the following:
```yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
    - command:
        - kube-scheduler
        - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
        - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
        - --bind-address=0.0.0.0
        - --feature-gates=TTLAfterFinished=true,ExpandCSIVolumes=true,CSIStorageCapacity=true,RotateKubeletServerCertificate=true
        - --kubeconfig=/etc/kubernetes/scheduler.conf
        - --leader-elect=true
        - --port=0
        - --config=/etc/kubernetes/scheduler-extender.yaml
      image: registry.cn-beijing.aliyuncs.com/kubesphereio/kube-scheduler:v1.22.10
      imagePullPolicy: IfNotPresent
      livenessProbe:
        failureThreshold: 8
        httpGet:
          path: /healthz
          port: 10259
          scheme: HTTPS
        initialDelaySeconds: 10
        periodSeconds: 10
        timeoutSeconds: 15
      name: kube-scheduler
      resources:
        requests:
          cpu: 100m
      startupProbe:
        failureThreshold: 24
        httpGet:
          path: /healthz
          port: 10259
          scheme: HTTPS
        initialDelaySeconds: 10
        periodSeconds: 10
        timeoutSeconds: 15
      volumeMounts:
        - mountPath: /etc/kubernetes/scheduler.conf
          name: kubeconfig
          readOnly: true
        - mountPath: /etc/localtime
          name: localtime
          readOnly: true
        - mountPath: /etc/kubernetes/scheduler-extender.yaml
          name: extender
          readOnly: true
        - mountPath: /etc/kubernetes/scheduler-policy-config.json
          name: extender-policy
          readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
    - hostPath:
        path: /etc/kubernetes/scheduler.conf
        type: FileOrCreate
      name: kubeconfig
    - hostPath:
        path: /etc/localtime
        type: File
      name: localtime
    - hostPath:
        path: /etc/kubernetes/scheduler-extender.yaml
        type: FileOrCreate
      name: extender
    - hostPath:
        path: /etc/kubernetes/scheduler-policy-config.json
        type: FileOrCreate
      name: extender-policy
status: {}
```

The changes relative to the stock manifest are: the extra --config=/etc/kubernetes/scheduler-extender.yaml flag at the end of the kube-scheduler command line, plus the volumeMounts and volumes entries named extender and extender-policy, which expose the two files created in steps 4.1 and 4.2 inside the pod. See the diff sketch below.
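In diff form the edits are roughly as follows (reconstructed by me from the manifest above, not copied from the original screenshots; the "..." elides unchanged lines):

```diff
 spec:
   containers:
   - command:
     - kube-scheduler
+    - --config=/etc/kubernetes/scheduler-extender.yaml
     ...
     volumeMounts:
+    - mountPath: /etc/kubernetes/scheduler-extender.yaml
+      name: extender
+      readOnly: true
+    - mountPath: /etc/kubernetes/scheduler-policy-config.json
+      name: extender-policy
+      readOnly: true
   volumes:
+  - hostPath:
+      path: /etc/kubernetes/scheduler-extender.yaml
+      type: FileOrCreate
+    name: extender
+  - hostPath:
+      path: /etc/kubernetes/scheduler-policy-config.json
+      type: FileOrCreate
+    name: extender-policy
```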
Because kube-scheduler is a static pod, kubelet restarts it automatically once the manifest is saved. If it does not restart, delete the pod manually: kubectl delete pod <kube-scheduler-pod-name> -n kube-system

5. Verify the result

Run:

kubectl describe node <gpu-node-name>

In the Capacity and Allocatable sections of the output you should now see the tencent.com/vcuda-core and tencent.com/vcuda-memory resources.
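To pull out just those lines (my addition; per the constraints in section I, a node with a single card should report 100 units of vcuda-core):

```sh
kubectl describe node <gpu-node-name> | grep vcuda
```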

Testing

Pull the test image: docker pull gaozhenhai/tensorflow-gputest:0.2
Create the manifest: vim vcuda-test.yaml, with the following content:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: vcuda-test
    qcloud-app: vcuda-test
  name: vcuda-test
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: vcuda-test
  template:
    metadata:
      labels:
        k8s-app: vcuda-test
        qcloud-app: vcuda-test
    spec:
      containers:
        - command:
            - sleep
            - 360000s
          env:
            - name: PATH
              value: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
          image: gaozhenhai/tensorflow-gputest:0.2
          imagePullPolicy: IfNotPresent
          name: tensorflow-test
          resources:
            limits:
              cpu: "4"
              memory: 8Gi
              tencent.com/vcuda-core: "50"
              tencent.com/vcuda-memory: "32"
            requests:
              cpu: "4"
              memory: 8Gi
              tencent.com/vcuda-core: "50"
              tencent.com/vcuda-memory: "32"
```
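Note how these values map onto the units from section I: tencent.com/vcuda-core: "50" requests half of one physical card (50/100), and tencent.com/vcuda-memory: "32" requests 32 × 256 MiB = 8 GiB of GPU memory.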

Apply the manifest: kubectl apply -f vcuda-test.yaml
Enter the container:

kubectl exec -it `kubectl get pods -o name | cut -d '/' -f2` -- bash

Run the benchmark:

cd /data/tensorflow/cifar10 && time python cifar10_train.py

Check the result:

While the training job runs, execute nvidia-smi pmon -s u -d 1 on the GPU node to monitor per-process GPU utilization.

Summary

This completes the container-layer vGPU virtualization setup.

